Data Analysis

Objective

To develop an analysis tool that can be used to characterize experimental datasets from tube flocculator experiments

Methods


Data source

The description of the experiments can be viewed in the previous reports. All experimental data (i.e., time, pressure, turbidity, flow rates, etc.) were collected using Process Controller and saved in Excel files. The data of interest (i.e., effluent turbidity) were stored by the date of the experimental run, and the state file indicated the stage of the treatment process (e.g., flocculation, settling).

The dataset was extracted using the Meta Data algorithm and analyzed in the steps described below.

Model fitting

In the settling state (state 4 or 5, depending on the experimental and software setup), the analysis of the settling dataset starts from its maximum turbidity reading; this eliminates the data fluctuation caused by the sudden stop of flow. The data are normalized to the maximum reading of the dataset and flipped into a positive hyperbolic curve for interpretation, so the dataset now reads as the cumulative amount of settling in the tube. A hyperbolic curve can be linearized using the double reciprocal method (Lineweaver-Burk plot), where the y-axis is the reciprocal of the cumulative amount of settling and the x-axis is the terminal velocity. The height of the column above the turbidimeter is assumed to be 5 cm.
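This preprocessing can be sketched in a few lines of Python (a hypothetical illustration, not the team's Mathcad implementation): truncate the trace at its maximum reading, normalize by that maximum, and flip the curve so it rises toward (N/N0)max.

```python
import numpy as np

def settled_fraction(turbidity):
    """Truncate a settling-state turbidity trace at its maximum reading,
    normalize by that maximum, and flip the curve so it reads as the
    cumulative fraction of turbidity removed (a positive hyperbolic curve)."""
    N = np.asarray(turbidity, dtype=float)
    N = N[int(np.argmax(N)):]   # start from the maximum turbidity reading
    return 1.0 - N / N[0]       # 0 at the start, rising toward (N/N0)max
```

The returned series is the "amount of settling" curve that the double reciprocal method is applied to.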

Data repeatability

The details of the experiment conducted to test data repeatability can be viewed here.

Equations

Assuming the dataset follows a hyperbolic function, it can be represented as:

Equation 1

1 - N/N0 = (N/N0)max · t / (KS + t)

or, solving for the normalized turbidity itself:

N/N0 = 1 - (N/N0)max · t / (KS + t)

where
N is the effluent turbidity [-]
N0 is the maximum/initial effluent turbidity during the settling state [-]
(N/N0)max is the maximum value that the hyperbolic function asymptotically approaches [-]
t is the time [T]
KS is the settling time constant of the flocs, i.e. the time at which half of the maximum settling is reached [T].

This simplification enables the team to extract the important parameters of the curve (i.e., KS and (N/N0)max) and to use these parameters when comparing curves.

In order to linearize the hyperbolic function, the double reciprocal is taken on both axes. By using terminal velocity as the reciprocal of time, the data can be quantified with a physically meaningful parameter instead of a frequency.

Terminal velocity is the velocity of the flocs settling in the column:

v = L / t

where
v is the terminal velocity [L/T]
L is the height of the column above the turbidimeter [L]

t can be redefined as:

Equation 2

t = L / v

Substituting Equation 2 into Equation 1 yields:

Equation 3

1 - N/N0 = (N/N0)max · L / (KS · v + L)

Taking the reciprocal of both sides of Equation 3 gives:

Equation 4

1 / (1 - N/N0) = 1 / (N/N0)max + [KS / (L · (N/N0)max)] · v

so a plot of 1/(1 - N/N0) against v is a straight line with intercept 1/(N/N0)max and slope KS / (L · (N/N0)max).

Equations can also be viewed here.
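The double reciprocal fit can be sketched in Python (a hypothetical stand-in for the Mathcad Algorithm file). It assumes the hyperbolic model for the settled fraction, S = (N/N0)max · t / (KS + t), and the substitution v = L/t; `L` defaults to the assumed 5 cm (0.05 m) column height.

```python
import numpy as np

def double_reciprocal_fit(t, S, L=0.05):
    """Linearize S = Smax * t / (Ks + t) via the double reciprocal:
    1/S = 1/Smax + (Ks / (L * Smax)) * v, where v = L / t.
    Returns (Ks, Smax, r_squared)."""
    t = np.asarray(t, dtype=float)
    S = np.asarray(S, dtype=float)
    keep = (t > 0) & (S > 0)            # reciprocals need nonzero values
    v = L / t[keep]                     # terminal velocity axis
    y = 1.0 / S[keep]                   # reciprocal of settled fraction
    slope, intercept = np.polyfit(v, y, 1)
    Smax = 1.0 / intercept              # (N/N0)max from the intercept
    Ks = slope * L * Smax               # Ks recovered from the slope
    y_hat = slope * v + intercept
    r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
    return Ks, Smax, r2
```

With noise-free synthetic data the fit recovers KS and (N/N0)max exactly, which makes a quick sanity check on the algebra.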

Window average

In statistics, a moving average is used to smooth time-series data.

The algorithm and equations for the window average are in the Algorithm Mathcad file. Here, we specify the window over which the program averages.

For data points closer to the start of the series than the half-window (the lower end), Equation 5a is used; Equation 5b is used at the upper end.

Equation 5a

Ybar_i = (1 / (i + m)) · Σ_{j=1}^{i+m} Y_j,  for i ≤ m

Equation 5b

Ybar_i = (1 / (n - i + m + 1)) · Σ_{j=i-m}^{n} Y_j,  for i > n - m

For any other data point between these two conditions, Equation 6 is used.

Equation 6

Ybar_i = (1/w) · Σ_{j=i-m}^{i+m} Y_j

where
w is the window size (an odd number)
Y is the data
n is the length of the experimental data
m is (w - 1) / 2
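The window average can be sketched in Python (a hypothetical illustration; the team's actual implementation is in the Algorithm Mathcad file), assuming the edge windows simply shrink to the points that are available:

```python
import numpy as np

def window_average(y, w=33):
    """Moving average with an odd window w. Interior points average the
    full window (Equation 6); points within m = (w-1)/2 of either end
    average only the available points (Equations 5a and 5b)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    m = (w - 1) // 2
    out = np.empty(n)
    for i in range(n):
        lo = max(0, i - m)              # window shrinks at the lower end
        hi = min(n, i + m + 1)          # ...and at the upper end
        out[i] = y[lo:hi].mean()
    return out
```

For example, with w = 3 the first point averages only the first two samples, while interior points average three.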

Mathcad files

Three Mathcad files are used:

  • Data Processor Function: extracts and sorts the raw data into columns by state.
  • Algorithm: stores the algorithms, such as the double reciprocal method, the window average, and minor calculations.
  • Settling Analysis: the main file; contains sections for the metafilter, plots, etc.

Results and Discussion


Model fitting

By extracting the data using the Meta Data files, we were able to choose any set of settling data of interest and plot and analyze it individually or collectively to see whether there is any trend in the data. Initially we proposed a polynomial fit as an option for evaluating the data, and an error analysis algorithm was used to determine which equation fit each dataset best. From the sum-squared-error analysis, third- and fourth-degree polynomials were found to be the best.

The limitation of the polynomial fit is that no single parameter characterizes every dataset. Each time a dataset was modeled the polynomial coefficients changed, and we could not find any trend that consistently described the settling data. It is also hard to fix one nth-degree polynomial for the entire set because each dataset gives a different error analysis.
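The sum-squared-error comparison described above can be sketched as follows (hypothetical Python; the actual error analysis was done in the Mathcad files):

```python
import numpy as np

def best_polynomial_degree(t, y, degrees=range(1, 6)):
    """Fit polynomials of several degrees to one dataset and return the
    degree with the lowest sum-squared error, plus all the SSE values."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    sse = {}
    for d in degrees:
        coeffs = np.polyfit(t, y, d)          # least-squares polynomial fit
        residuals = y - np.polyval(coeffs, t)
        sse[d] = float(np.sum(residuals ** 2))
    return min(sse, key=sse.get), sse
```

Because the winning degree differs from dataset to dataset, as noted above, no single polynomial generalizes across runs.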

A hyperbolic fit was proposed, and a sample experimental dataset was normalized to its maximum turbidity reading to determine the percentage of flocs that settled over time. The Lineweaver-Burk (double reciprocal) method was then applied.

We analyzed one dataset from the iterated flow rate experiment (50 ft tube flocculator, 50 NTU initial turbidity, and 25 mg/L alum): the run at a flow rate of 2.95 mL/s for 600 s.

Figure 1 shows the raw settling-state data normally obtained in the experiment. The initial drop in each curve represents the bigger flocs, which settle faster than the smaller flocs; the tail of the curve represents the settling of the smaller particles in the settling column. Ideally, the drop should be very steep and the tail should approach 0. The raw data are normalized so that the percentage turbidity drop can be compared across datasets, since the initial turbidity at the inlet can fluctuate: a 5 NTU drop in one run does not mean the same as a 5 NTU drop in another. The normalization also enables the team to evaluate the efficiency of the tube flocculator.


Figure 1. Typical raw data in the settling column

As previously mentioned, the experimental data were flipped about the horizontal axis to indicate "positive growth". Note that the data now read as the amount of particles that have settled.

In order to quantify which parameters best describe the data, we linearized the data by taking the reciprocal of both axes. The time axis (x-axis) was transformed into velocity by dividing the length of the tube above the turbidimeter by the time. The result of the linearization is illustrated in Figure 2 and Figure 3.


Figure 2. Best fit linearization using double reciprocal method.


Figure 3. The parameters from the linearization were substituted into Equation 1.

The R2 value obtained from the linearization was low (0.622) for this dataset, given the fluctuation in the raw data and the effect of reciprocation, which amplifies the lower end and compresses the upper end of the original data (x-axis). This also amplifies the error in the fluctuating data.

In this example, we smoothed the data with a window of 33 points (w = 33). The moving average increased the R2 value to 0.870. However, KS and (N/N0)max showed no improvement.


Figure 4. Raw data and its moving average of 33 (red line). The best fit from linearization is shown here (yellow line).

Table 1. The parameters obtained from linearization of the iterated flow rate experimental data.

Flow rate (mL/s) | KS        | (N/N0)max | R2    | KS 1      | (N/N0)max 1 | R2 1
-----------------|-----------|-----------|-------|-----------|-------------|------
1.3              | 0.001563  | 0.952     | 0.456 | 0.001417  | 0.934       | 0.709
1.45             | 0.0004205 | 0.743     | 0.456 | 0.0003706 | 0.74        | 0.798
1.6              | 0.0005225 | 0.828     | 0.647 | 0.000584  | 0.848       | 0.904
1.75             | 0.0001045 | 0.666     | 0.233 | 0.0001108 | 0.687       | 0.601
1.9              | 0.0005999 | 0.904     | 0.775 | 0.0006628 | 0.923       | 0.952
2.05             | 0.0006199 | 0.87      | 0.739 | 0.0007027 | 0.896       | 0.868
2.2              | 0.000774  | 0.91      | 0.545 | 0.0008465 | 0.932       | 0.638
2.5              | 0.001861  | 1.321     | 0.309 | 0.0008398 | 0.965       | 0.585
2.65             | 0.001861  | 1.321     | 0.309 | 0.0008398 | 0.965       | 0.585
2.8              | 0.0007028 | 0.854     | 0.654 | 0.0007923 | 0.878       | 0.792
2.95             | 0.0002372 | 0.823     | 0.668 | 0.0002776 | 0.839       | 0.893
3.1              | 0.0007587 | 0.798     | 0.529 | 0.0006515 | 0.793       | 0.929

1 A window average of 33 was applied in the determination of these parameters.

Data repeatability

The results and discussion of data fluctuation for the previous experimental setup can be viewed here.

In the previous setup, the average KS value derived from 10 repeat runs was 0.00391 with a standard deviation of 0.00247. After the moving average was applied, the average was 0.00353 with a standard deviation of 0.00241. Both results show high standard deviations. The (N/N0)max values without and with the moving average were 0.6121 and 0.6123, respectively, with standard deviations of 0.293 and 0.294.
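The repeatability statistics above reduce to a mean and sample standard deviation over the repeat runs; a minimal sketch (hypothetical Python):

```python
import numpy as np

def repeatability(values):
    """Mean and sample standard deviation of a fitted parameter
    (e.g. KS or (N/N0)max) across repeat runs."""
    x = np.asarray(values, dtype=float)
    return x.mean(), x.std(ddof=1)      # ddof=1: sample standard deviation
```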

Table 2. The average values of KS and (N/N0)max and their corresponding standard deviations for the old and new experimental setups.

Setup | KS      | Std dev. | KS 1    | Std dev. | (N/N0)max | Std dev. | (N/N0)max 1 | Std dev.
------|---------|----------|---------|----------|-----------|----------|-------------|---------
Old   | 0.00391 | 0.00247  | 0.00352 | 0.00241  | 0.6121    | 0.293    | 0.6123      | 0.294
New   | 0.00173 | 0.000975 | 0.00199 | 0.00131  | 1.1243    | 0.193    | 1.2135      | 0.294

1 A moving average of 33 was applied.

Progress


Apart from analyzing past datasets, the team is also interested in analyzing datasets from the new setup, which addresses some of the problems encountered in the previous setup. The new setup can be viewed here, and the analysis of the experimental runs can be viewed here.


Back to Tube Floc Home Page
