QC Files I use:
- Flowcell_QC_stats.txt- I don't usually use this. It BLASTs reads against some ApeKI maize sequences and looks at read length, but it only works for maize / ApeKI.
- Flowcell_QC1.txt- I use this summary table, but would like to change it a lot.
- Flowcell_QC2.txt- This has the reads per sample, very useful.
- Flowcell_adapter_dimers.txt- Not very useful for me because the file is too big.
- Flowcell_adapter_dimers_parsed.txt- Lists number of dimers per barcode. I like it!
Questions:
- Does it keep junk reads somewhere?
Changes to Flowcell_QC1.txt file:
- Can it write to the same file each time and just append new rows to the bottom (possibly one file per enzyme)?
- I would like to change the analysis to base it on a 96-well plate, not a lane.
- Fields I need (bold is most important):
- Date (not in script)
- Flowcell (not in script)
- Lane (Flowcell lane #)
- Plate Name (Currently just gives one plate if 384-plex, I'd like to do the analysis by plate)
- Read_Count (total number of reads)
- Pass_Count (number of reads that have cut site and barcode)
- Pass_Rate (% of reads that have cut site and barcode)
- Pass_Filter_Count (number of reads that pass Illumina's filter)
- Pass_Filter_Rate (% of reads that pass Illumina's filter)
- Adapter_dimers
- N_samples (samples on plate, not including blanks)
- Mean_count (Pass_Filter_Count/N_samples)
- SD_count (standard deviation of the per-sample read counts; we could drop this)
- CV_count_(%) (coefficient of variation of the per-sample read counts, i.e. SD_count/Mean_count as a %)
- No_Blanks (not in script, number of blanks on a plate)
- Blank_Min (not in script, for each blank, calculates Pass_Count/Mean_count [% of mean reads], then returns the lowest value)
- Blank_Max (not in script, for each blank, calculates Pass_Count/Mean_count [% of mean reads], then returns the maximum value)
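
To make the requested per-plate fields concrete, here is a rough Python sketch of how Mean_count, SD_count, CV_count_(%), No_Blanks, Blank_Min and Blank_Max could be computed for one plate, and how a row could be appended to a single running file. This is not the existing script. The function names (plate_summary, append_row), the tab-delimited output, and the choice to take the mean over the non-blank samples' own read counts are all assumptions made for illustration.

```python
import csv
import os
import statistics


def plate_summary(sample_counts, blank_counts):
    """Summarise one 96-well plate (hypothetical helper, not in the script).

    sample_counts: read counts for the non-blank samples on the plate
    blank_counts:  read counts for the blank wells on the plate
    """
    n_samples = len(sample_counts)
    # Assumption: the mean is taken over the non-blank samples only.
    mean_count = sum(sample_counts) / n_samples
    # Sample standard deviation needs at least two samples.
    sd_count = statistics.stdev(sample_counts) if n_samples > 1 else 0.0
    cv_count = 100.0 * sd_count / mean_count if mean_count else 0.0
    # Each blank expressed as a percentage of the mean sample read count.
    blank_pct = [100.0 * b / mean_count for b in blank_counts] if mean_count else []
    return {
        "N_samples": n_samples,
        "Mean_count": mean_count,
        "SD_count": sd_count,
        "CV_count_(%)": cv_count,
        "No_Blanks": len(blank_counts),
        "Blank_Min": min(blank_pct) if blank_pct else None,
        "Blank_Max": max(blank_pct) if blank_pct else None,
    }


def append_row(path, row):
    """Append one row to a tab-delimited file, writing the header only once."""
    write_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(row), delimiter="\t")
        if write_header:
            writer.writeheader()
        writer.writerow(row)
```

A caller would add Date, Flowcell, Lane and Plate Name to the dictionary before calling append_row. Putting the enzyme name in the file path is one way to get a separate running file per enzyme.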