User Input Information

The user inputs needed to analyze open field exploration data using the opynfield package are divided into five groups based on the aspects of analysis that the settings control, and based on the likelihood that a user would need to adjust these settings from their default values.

User Inputs

The UserInput dataclass contains all of the settings that are necessary to read the tracking data into the opynfield Track format, as well as settings that must be changed for each analysis.

  1. groups_and_types contains the names of the groups that you will be analyzing as well as the filetypes that each group was recorded in. It is a dictionary where the keys are the group names (strings) and the values are lists of filetypes (strings). For example, if you have two groups and their tracks were recorded using both Buridian Tracker and Ethovision, your groups_and_types should be: {'GroupA': ["Buridian Tracker", "Ethovision Excel Version 2"], 'GroupB': ["Buridian Tracker", "Ethovision Excel Version 2"]}. Valid filetypes include: 'Buridian Tracker', 'Ethovision Excel Version 1', 'Ethovision Excel Version 2', 'Ethovision Text', 'Ethovision Through MATLAB', 'AnyMaze Center', and 'AnyMaze Head'. If your experiment includes track types 'Ethovision Excel Version 1', 'Ethovision Excel Version 2', or 'Ethovision Text', then the group names you provide must match the group names that are recorded in the Ethovision data sheets.

  2. groups_to_paths contains the names of the groups that you provided in groups_and_types as well as alternate group names that exclude any special characters. This is needed so that we are able to save results with the group name in the file name. It is a dictionary where the keys are the original group names (strings), and the values are the alternate group names (strings). For example, if your groups are named by genotype, you must remove any / characters, as in {'w-': 'white', 'w-/+': 'heterozygote', '+/+': 'wildtype'}. If your group names have no special characters, you can just provide the same group names (e.g. {'GroupA': 'GroupA', 'GroupB': 'GroupB'}).

  3. arena_radius_cm contains the radius (in centimeters) of the arena in which the tracks were recorded. It is a float. This is needed for the Buridian Tracker and AnyMaze filetypes, which record animal position in pixels rather than centimeters. Right now this is a global parameter (i.e. you cannot read in tracks with two different arena sizes at once).

  4. sample_freq contains the recording frame rate (in Hz) for the tracks. It is an integer. Right now this is a global parameter (i.e. you cannot read in tracks with two different frame rates at once).

  5. edge_dist_cm contains the distance from the boundary of the arena which should be considered the edge region. It is a float. With Drosophila we have often used 0.5 cm or, more commonly, 1 cm. With other taxa it may be necessary to try several different values and see which ensures the animals spend enough time in the arena edge. Right now this is a global parameter (i.e. you cannot read in tracks with two different edge distances at once).

  6. time_bin_size contains the amount of time (in seconds) that should be aggregated into one point. It allows us to change the density of the data from sample_freq points per second to 1/time_bin_size points per second. It is an integer.

  7. inactivity_threshold contains the threshold at which we should consider a movement too small to be an actual step by the animal, and instead attribute it to body wobble. It is a float.

  8. verbose indicates whether or not we want progress updates displayed as the analysis is running. It is a boolean.

  9. result_path contains the path to where we want to save the results of the analysis. It is a string.

  10. running_window_length is a parameter of the smoothing function that is applied to the non-Ethovision recording types. The smoothing function is essentially a weighted running average, and running_window_length indicates how many points should be contained in the window for the average. It must be an odd integer and defaults to 5 in order to match the native Ethovision smoothing function.

  11. window_step_size is a parameter of the smoothing function that is applied to the non-Ethovision recording types. The smoothing function is essentially a weighted running average, and window_step_size indicates how many points the averaging window should move forward between each average. It is an integer and defaults to 1 in order to match the native Ethovision smoothing function.

  12. trim indicates how many recording points it takes for the animal to be placed in the arena. It is important for AnyMaze data in order to correctly impute the arena boundary. It is an integer and defaults to 0, but should be increased for AnyMaze tracking data.

  13. bound_level indicates how many standard deviations away from the mean a parameter value must be to be considered an outlier. It is a float and defaults to 2.0 (roughly a 95% threshold for normally distributed values).
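
As a rough illustration of how these settings fit together, the sketch below collects them in a stand-in dataclass and checks the constraints stated above. UserInputSketch and its problems method are hypothetical names written for this example and are not the opynfield API. The field names and the defaults for running_window_length, window_step_size, trim, and bound_level follow the descriptions above; every other value is a placeholder, and the reading of "special characters" as anything other than letters, digits, and underscores is an assumption.

```python
from dataclasses import dataclass, replace

# Filetype strings listed in the groups_and_types description above.
VALID_FILETYPES = {
    "Buridian Tracker",
    "Ethovision Excel Version 1",
    "Ethovision Excel Version 2",
    "Ethovision Text",
    "Ethovision Through MATLAB",
    "AnyMaze Center",
    "AnyMaze Head",
}


@dataclass
class UserInputSketch:
    """Hypothetical stand-in that mirrors the settings described above."""

    groups_and_types: dict[str, list[str]]
    groups_to_paths: dict[str, str]
    arena_radius_cm: float
    sample_freq: int
    edge_dist_cm: float
    time_bin_size: int
    inactivity_threshold: float
    verbose: bool
    result_path: str
    running_window_length: int = 5
    window_step_size: int = 1
    trim: int = 0
    bound_level: float = 2.0

    def problems(self) -> list[str]:
        """Return the violated constraints (an empty list when all pass)."""
        found = []
        for group, filetypes in self.groups_and_types.items():
            for filetype in filetypes:
                if filetype not in VALID_FILETYPES:
                    found.append(f"{group}: unknown filetype {filetype!r}")
        if set(self.groups_to_paths) != set(self.groups_and_types):
            found.append("groups_to_paths needs exactly one entry per group")
        for alt_name in self.groups_to_paths.values():
            # Assumption: "no special characters" means letters, digits, "_".
            if not alt_name.replace("_", "").isalnum():
                found.append(f"alternate name {alt_name!r} has special characters")
        if self.running_window_length % 2 == 0:
            found.append("running_window_length must be odd")
        return found


settings = UserInputSketch(
    groups_and_types={
        "w-/+": ["Buridian Tracker", "Ethovision Excel Version 2"],
        "+/+": ["Buridian Tracker", "Ethovision Excel Version 2"],
    },
    groups_to_paths={"w-/+": "heterozygote", "+/+": "wildtype"},
    arena_radius_cm=4.2,  # numeric values below are placeholders only
    sample_freq=30,
    edge_dist_cm=1.0,
    time_bin_size=1,
    inactivity_threshold=0.01,
    verbose=True,
    result_path="results",
)

print(settings.problems())  # [] because every check passes

# An even window length violates the "must be odd" rule.
even_window = replace(settings, running_window_length=4)
print(even_window.problems())
```

Returning a list of problems rather than raising on the first one makes it easier to fix several settings in a single pass.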

Coverage Asymptote Settings

The CoverageAsymptote dataclass contains information on how to fit a time vs coverage model in order to calculate the asymptote coverage value for either an individual or a group.

  1. f_name specifies which functional form to use when fitting the time vs coverage model. It defaults to the fixed exponential model (y = a*(e^(b*x) - 1))

  2. asymptote_param specifies which parameter of the model indicates the asymptote magnitude. It defaults to 0 (the first parameter, a in the fixed exponential model)

  3. asymptote_sign indicates the expected sign of the asymptote parameter. It defaults to -1.

  4. initial_parameters provides initial parameter values to use when fitting the model, defaults to (-0.1, -0.1)

  5. parameter_bounds provides first order bounds to ensure that appropriately signed parameters are fit, defaults to ([-10, -10], [0, 0])

  6. max_f_eval indicates how many iterations are allowed before assuming non-convergence, defaults to 4000
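
To make the roles of asymptote_param and asymptote_sign concrete, here is a minimal sketch of the fixed exponential form y = a*(e^(b*x) - 1). When a and b are both negative, as the default bounds require, e^(b*x) decays to 0 and y approaches -a, so the asymptote magnitude is the parameter at index 0 multiplied by -1. The function names and parameter values are invented for illustration, and combining the two settings by multiplication is one plausible reading rather than a description of the opynfield internals.

```python
import math


def fixed_exponential(x: float, a: float, b: float) -> float:
    """Fixed exponential form from the text: y = a * (e^(b*x) - 1)."""
    return a * (math.exp(b * x) - 1.0)


def asymptote_from_params(params, asymptote_param=0, asymptote_sign=-1):
    """Pick out the asymptote parameter and apply its expected sign."""
    return asymptote_sign * params[asymptote_param]


# Made-up fitted values that respect the default bounds ([-10, -10], [0, 0]).
params = (-0.8, -0.05)

asymptote = asymptote_from_params(params)
late_value = fixed_exponential(1000.0, *params)

print(asymptote)   # magnitude of a
print(late_value)  # the curve has effectively reached the asymptote here
```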

Model Fit Settings

The ModelSpecification dataclass contains two attributes:

  1. axes specifies which x-measure and which y-measure are being fit

  2. model specifies what model will be used to fit that x and y relationship

The model can be one of 4 options: ExponentialModel, FixedExponentialModel, LinearIncreaseModel, LinearDecreaseModel. Each of these in turn is a dataclass that contains information about how to fit that model.

  1. initial_params provides initial parameter values to use when fitting the model, defaults vary by model type

  2. bounds provides first order bounds to ensure that appropriately signed parameters are fit, defaults vary by model type

  3. max_eval indicates how many iterations are allowed before assuming non-convergence, defaults to 4000 in all model types

  4. display_parts provides string components that can be joined with the parameter fits to properly display the fit equation, defaults vary by model type
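
The idea behind display_parts can be illustrated with a short sketch that interleaves string components with fitted parameter values. The parts list, the parameter values, and the format_equation helper are all hypothetical; the components actually stored in each opynfield model may differ.

```python
def format_equation(display_parts: list[str], params: list[float]) -> str:
    """Interleave text components with fitted parameter values."""
    pieces = []
    for i, part in enumerate(display_parts):
        pieces.append(part)
        # A fitted value follows each component until the values run out.
        if i < len(params):
            pieces.append(f"{params[i]:.3f}")
    return "".join(pieces)


# Hypothetical components for a fixed exponential fit, y = a*(e^(b*x) - 1).
parts = ["y = ", " * (e^(", " * x) - 1)"]
fitted = [-0.8, -0.05]

equation = format_equation(parts, fitted)
print(equation)  # y = -0.800 * (e^(-0.050 * x) - 1)
```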

Plot Settings

The PlotSettings dataclass contains all of the settings that are necessary for the plots that are generated by opynfield.

  1. group_colors is the only mandatory input to PlotSettings. It dictates which colors should be used for which groups in the plots. It is a dictionary where the keys are the group names (strings) provided in the groups_and_types attribute of a UserInput instance, and the values are the color codes (strings) to plot with. For example, if you want one group to be plotted in blue and one group to be plotted in green, you could provide {'GroupA': 'b', 'GroupB': 'g'}

  2. marker_size is the size of the markers used in the scatter plots, defaults to 2.

  3. marker_color is the color that the markers should be in the individual scatter plots (in group comparison plots, color is determined by group_colors), defaults to 'b' for blue

  4. individual_model_fit indicates whether or not the model fit for individual plots should be displayed, defaults to True

  5. fit_color is the color that the individual model fit should be, if displayed, defaults to 'k' (black)

  6. alpha is the transparency level that the model fit for individuals should be displayed with, defaults to 0.3 (0 is completely transparent and 1 is completely opaque)

  7. group_error_bars indicates whether or not the error bars on group averages should be displayed, defaults to True

  8. error_color indicates what color the group error bars on single group plots should be, defaults to 'b' (blue); in group comparison plots, color is determined by group_colors

  9. n_between_error indicates how many points to skip between displaying error bars, defaults to 1 (error bars are put on every marker)

  10. group_model_fit indicates whether or not the group model fit should be displayed, defaults to True

  11. equation indicates whether or not the equation of the model fit should be displayed on single group or individual plots, defaults to True

  12. display_individual_figures indicates whether the individual plots should be rendered, defaults to False

  13. save_individual_figures indicates whether the individual plots should be saved out, defaults to True

  14. display_solo_group_figures indicates whether the single group plots should be rendered, defaults to False

  15. save_solo_group_figures indicates whether the single group plots should be saved out, defaults to True

  16. save_combined_view_figures indicates whether the single group average and component individual plots should be saved out, defaults to True

  17. fig_extension provides what file format the plots should be saved in, defaults to .png

  18. colormap_name provides the color map scheme to use in the track trace time bar, defaults to 'gist_rainbow'

  19. edge_color provides the color to use to plot the arena boundary in the track trace, defaults to 'k' (black)

  20. error_width indicates how thick the error bar should be in group plots, defaults to 0.5

  21. save_group__comparison_figures indicates whether the group comparison plots should be saved out, defaults to True
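
Two of the settings above lend themselves to a quick check before plotting. The sketch below uses hypothetical helpers, not part of opynfield, to confirm that every group has a color and to show which markers would receive an error bar if n_between_error is read as a step size (an assumption that matches the stated default of 1 placing error bars on every marker).

```python
def missing_group_colors(groups_and_types: dict, group_colors: dict) -> list[str]:
    """Return the groups that have no entry in group_colors."""
    return [group for group in groups_and_types if group not in group_colors]


def error_bar_indices(n_points: int, n_between_error: int = 1) -> list[int]:
    """Indices of the markers that get an error bar (every n-th marker)."""
    return list(range(0, n_points, n_between_error))


groups_and_types = {
    "GroupA": ["Buridian Tracker"],
    "GroupB": ["Buridian Tracker"],
}

print(missing_group_colors(groups_and_types, {"GroupA": "b", "GroupB": "g"}))  # []
print(missing_group_colors(groups_and_types, {"GroupA": "b"}))  # ['GroupB']

print(error_bar_indices(5))     # [0, 1, 2, 3, 4]
print(error_bar_indices(7, 3))  # [0, 3, 6]
```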

Default Settings

The Defaults dataclass contains the remaining settings, which typically do not need to be changed by the user.

  1. node_size indicates the angle of the circular sectors that divide the arena edge into bins for coverage, defaults to 0.1

  2. save_group_csvs indicates whether or not to save a .csv file for each calculated measure for each group, defaults to True

  3. save_all_group_csvs indicates whether or not to save a .csv file for each calculated measure for all groups together, a helpful format to export for statistical tests in other programs, defaults to True. save_group_csvs must be True for save_all_group_csvs to be True

  4. save_group_model_csvs indicates whether or not to save a .csv file that includes the individuals’ fitted parameters for each group, defaults to True

  5. save_all_group_model_csvs indicates whether or not to save a .csv file that includes the individuals’ fitted parameters for all groups together, a helpful format to export for statistical tests in other programs, defaults to True

  6. n_points_coverage is the number of points to group together in an average for coverage, defaults to 36

  7. n_points_pica is the number of points to group together in an average for PICA, defaults to 36

  8. n_points_pgca is the number of points to group together in an average for PGCA, defaults to 36

  9. n_bins_percent_coverage is the number of bins to group together in an average for percent coverage, defaults to 10

  10. time_averaged_measures lists the measures to average by time, defaults to ["r", "activity", "p_plus_plus", "p_plus_minus", "p_plus_zero", "p_zero_plus", "p_zero_zero", "coverage", "percent_coverage", "pica", "pgca", "p_plus_plus_given_plus", "p_plus_minus_given_plus", "p_plus_zero_given_plus", "p_zero_plus_given_zero", "p_zero_zero_given_zero", "p_plus_plus_given_any", "p_plus_minus_given_any", "p_plus_zero_given_any", "p_zero_plus_given_any", "p_zero_zero_given_any"]

  11. coverage_averaged_measures lists the measures to average by coverage, defaults to ["activity", "p_plus_plus", "p_plus_minus", "p_plus_zero", "p_zero_plus", "p_zero_zero", "p_plus_plus_given_plus", "p_plus_minus_given_plus", "p_plus_zero_given_plus", "p_zero_plus_given_zero", "p_zero_zero_given_zero", "p_plus_plus_given_any", "p_plus_minus_given_any", "p_plus_zero_given_any", "p_zero_plus_given_any", "p_zero_zero_given_any"]
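
For the n_points settings, one plausible reading of "group together in an average" is averaging consecutive, non-overlapping blocks of points. The sketch below shows that reading with a hypothetical block_average helper and made-up values; it is not taken from opynfield, which may group points differently.

```python
from statistics import mean


def block_average(values: list[float], n_points: int) -> list[float]:
    """Average consecutive, non-overlapping blocks of n_points values.

    The final block may hold fewer than n_points values.
    """
    return [
        mean(values[start:start + n_points])
        for start in range(0, len(values), n_points)
    ]


# Made-up measure values; a real analysis would use n_points = 36 by default.
measure = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
averaged = block_average(measure, n_points=3)

print(averaged)  # [2.0, 5.0, 7.0]
```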