rna_library.processing.row_utils

Utility functions for transforming rows in a reactivity dataframe

Module Contents

Functions

add_reactivity(row, output_directory, col_name = 'mismatches')

Function that gets the raw reactivities for a given construct

block_commons(row, start, end)

Function that blocks off the common start and end sequences of an RNA construct with N’s

score(row)

Function that generates a DSCI score for an RNA and its reactivity values. The value lies in the range [0, 1], with 0.95 being a common quality cutoff.

signal_to_noise(row)

Function that calculates the signal-to-noise ratio for a DMS entry by using the ratio of mutations for (A + C)/(G + U).

num_reads(row, histos)

Function that finds the number of reads for a given DMS entry row.

collect_junction_entries(m, reactivity, construct, sn, reads, score, holder)

Utility function that gets all JunctionEntry objects across a reactivity dataframe.

row_normalize_hairpin(row, norm_seq, norm_ss, factor, nts)

Function that performs a hairpin normalization on a pd.Series representing a construct.

rna_library.processing.row_utils.add_reactivity(row, output_directory, col_name='mismatches')

Function that gets the raw reactivities for a given construct

Param

pandas.Series row: row to get reactivity for

Param

str output_directory: the base directory from which to get the population files

Param

str col_name: column name to get values from, defaults to ‘mismatches’

Parameters
  • row (pandas.Series) –

  • output_directory (str) –

  • col_name (str) –

Return type

List[float]

rna_library.processing.row_utils.block_commons(row, start, end)

Function that blocks off the common start and end sequences of an RNA construct with N’s

Param

pandas.Series row: row to block off commons for

Param

str start: the common start sequence

Param

str end: the common end sequence

Parameters
  • row (pandas.Series) –

  • start (str) –

  • end (str) –

Return type

str
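The masking described above can be sketched as follows. This is only an illustration, not the library's implementation: the helper name block_commons_sketch is hypothetical, a plain dict stands in for the pandas.Series row, and the sequence is assumed to live in an 'RNA' column.

```python
def block_commons_sketch(sequence: str, start: str, end: str) -> str:
    """Replace a shared 5' prefix and 3' suffix with N's (illustrative only)."""
    blocked = sequence
    # Mask the common start only if the sequence really begins with it.
    if start and blocked.startswith(start):
        blocked = "N" * len(start) + blocked[len(start):]
    # Mask the common end only if the sequence really finishes with it.
    if end and blocked.endswith(end):
        blocked = blocked[: len(blocked) - len(end)] + "N" * len(end)
    return blocked


# A dict stands in for a dataframe row; the 'RNA' key is an assumption.
row = {"RNA": "GGAACCUUAGCAAAC"}
masked = block_commons_sketch(row["RNA"], start="GGAA", end="AAAC")
print(masked)  # NNNNCCUUAGCNNNN
```

The length of the sequence is preserved, so per-nucleotide reactivity values stay aligned with the masked string.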

rna_library.processing.row_utils.score(row)

Function that generates a DSCI score for an RNA and its reactivity values. The value lies in the range [0, 1], with 0.95 being a common quality cutoff.

Param

pandas.Series row: row to get a score for

Return type

float

Parameters

row (pandas.Series) –
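One way the 0.95 cutoff mentioned above might be applied once scores are computed is sketched here. The scoring itself is not reproduced. The 'score' and 'name' keys, the helper passes_quality, and the choice to treat the cutoff as inclusive are all assumptions made for illustration.

```python
from typing import Dict, List

DSCI_CUTOFF = 0.95  # the common quality cutoff named in the description


def passes_quality(row: Dict[str, object], cutoff: float = DSCI_CUTOFF) -> bool:
    """Return True when the row's score meets the cutoff (inclusive by assumption)."""
    return float(row["score"]) >= cutoff


# Dicts stand in for dataframe rows; the score values are made up for the example.
rows: List[Dict[str, object]] = [
    {"name": "c1", "score": 0.99},
    {"name": "c2", "score": 0.42},
    {"name": "c3", "score": 0.95},
]
kept = [r["name"] for r in rows if passes_quality(r)]
print(kept)  # ['c1', 'c3']
```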

rna_library.processing.row_utils.signal_to_noise(row)

Function that calculates the signal-to-noise ratio for a DMS entry by using the ratio of mutations for (A + C)/(G + U).

Param

pandas.Series row: row to get the signal-to-noise ratio for

Return type

float

Parameters

row (pandas.Series) –
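The (A + C)/(G + U) ratio can be sketched as below. This is a minimal illustration under stated assumptions: mutation values are summed per nucleotide class, the helper name signal_to_noise_sketch is hypothetical, and returning infinity for a zero denominator is a choice made here rather than documented behavior.

```python
from typing import Sequence


def signal_to_noise_sketch(sequence: str, mutations: Sequence[float]) -> float:
    """Ratio of summed mutation values at A/C positions to those at G/U positions."""
    if len(sequence) != len(mutations):
        raise ValueError("sequence and mutations must have the same length")
    # DMS modifies A and C, so those positions are treated as signal.
    signal = sum(m for nt, m in zip(sequence, mutations) if nt in "AC")
    # G and U should be largely unreactive, so they are treated as noise.
    noise = sum(m for nt, m in zip(sequence, mutations) if nt in "GU")
    if noise == 0:
        return float("inf")  # assumption: no noise means an unbounded ratio
    return signal / noise


ratio = signal_to_noise_sketch("GACU", [1.0, 6.0, 4.0, 1.0])
print(ratio)  # 5.0
```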

rna_library.processing.row_utils.num_reads(row, histos)

Function that finds the number of reads for a given DMS entry row.

Param

pandas.Series row: row to get the number of reads for

Param

Dict[str,dreem.MutationHistogram] histos: histogram dictionary that has read information

Parameters
  • row (pandas.Series) –

  • histos (Dict[str, any]) –

Return type

int
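A lookup of this kind might look like the sketch below. Everything specific in it is an assumption: HistogramStandIn is a hypothetical stand-in for dreem.MutationHistogram, its num_reads field is assumed, the dictionary is assumed to be keyed by a construct name stored under 'name', and returning 0 for a missing construct is a choice made for the example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class HistogramStandIn:
    """Hypothetical stand-in for a mutation histogram; only the field used here."""
    num_reads: int


def num_reads_sketch(row: Dict[str, str], histos: Dict[str, HistogramStandIn]) -> int:
    """Look up the read count for the construct named in the row."""
    histogram = histos.get(row["name"])
    # Assumption: a construct with no histogram is reported as having zero reads.
    return histogram.num_reads if histogram is not None else 0


histos = {"construct_1": HistogramStandIn(num_reads=1500)}
print(num_reads_sketch({"name": "construct_1"}, histos))  # 1500
print(num_reads_sketch({"name": "construct_2"}, histos))  # 0
```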


rna_library.processing.row_utils.collect_junction_entries(m, reactivity, construct, sn, reads, score, holder)

Utility function that gets all JunctionEntry objects across a reactivity dataframe.

Param

Motif m: the base motif to get data for

Param

List[int] reactivity: list of reactivity values for the construct

Param

str construct: name of the construct

Param

float sn: signal to noise ratio of the construct

Param

int reads: number of sequencer reads for the construct

Param

float score: DSCI score for the construct

Param

Dict[str,List[JunctionEntry]] holder: temporary holder for all of the JunctionEntry objects

Return type

None

rna_library.processing.row_utils.row_normalize_hairpin(row, norm_seq, norm_ss, factor, nts)

Function that performs a hairpin normalization on a pd.Series representing a construct. Returns the normalized reactivity series.

Param

pd.Series row: dataframe row describing a construct. Must have ‘RNA’, ‘structure’ and ‘reactivity’ columns

Param

str norm_seq: normalization hairpin sequence

Param

str norm_ss: normalization hairpin secondary structure

Param

float factor: factor to which the reference value will be set

Param

List[str] nts: nucleotides to be considered in the normalization scheme. Must be unpaired.

Return type

List[float]

Parameters
  • row (pandas.Series) –

  • norm_seq (str) –

  • norm_ss (str) –

  • factor (float) –

  • nts (List[str]) –
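The parameters above suggest a scheme along the following lines. This is a sketch of one plausible reading, not the library's implementation: the reference value is assumed to be the mean reactivity of the selected unpaired nucleotides inside the hairpin, the helper name normalize_to_hairpin_sketch is hypothetical, and a dict stands in for the pd.Series row.

```python
from typing import Dict, List


def normalize_to_hairpin_sketch(
    row: Dict[str, object],
    norm_seq: str,
    norm_ss: str,
    factor: float,
    nts: List[str],
) -> List[float]:
    """Scale reactivities so the hairpin reference value equals `factor`."""
    seq, ss, react = row["RNA"], row["structure"], row["reactivity"]
    idx = seq.find(norm_seq)
    if idx == -1 or ss[idx: idx + len(norm_ss)] != norm_ss:
        raise ValueError("normalization hairpin not found in the construct")
    # Collect reactivities of the chosen nucleotides that are unpaired ('.').
    reference = [
        react[idx + i]
        for i, (nt, db) in enumerate(zip(norm_seq, norm_ss))
        if nt in nts and db == "."
    ]
    if not reference:
        raise ValueError("no unpaired reference nucleotides in the hairpin")
    ref_value = sum(reference) / len(reference)  # assumption: mean is the reference
    if ref_value == 0:
        raise ValueError("reference value is zero; cannot normalize")
    scale = factor / ref_value
    return [r * scale for r in react]


row = {
    "RNA": "GGCGAAAGCC",
    "structure": "(((....)))",
    "reactivity": [0.0, 0.0, 0.0, 1.0, 2.0, 4.0, 6.0, 0.0, 0.0, 0.0],
}
normalized = normalize_to_hairpin_sketch(row, "CGAAAG", "(....)", 1.0, ["A"])
print(normalized)
```

In this example the three unpaired A's in the loop have a mean of 4.0, so every value is multiplied by 0.25 and the reference mean becomes 1.0.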