Pandas DataFrames and Plotly to display CSV data

About

Pandas DataFrames are a useful tool for data analysis. They can be used to read in patient radiation dose data for further analysis, and the resulting DataFrame can then serve as the data source for Plotly charts. This page provides a simple example of this workflow. Version 1.0 of the OpenREM patient dose management system uses Pandas DataFrames and Plotly for its built-in charting features.

Import CSV data into a DataFrame

            import pandas as pd
            import plotly.express as px

            # Read the contents of the testDataLarge.csv file into a Pandas DataFrame
            df = pd.read_csv("testDataLarge.csv")

            # Show the DataFrame column headings
            df.columns
          

This results in the following output:


Index(['Institution',
       'Manufacturer',
       'Model name',
       'Station name',
       'Study date',
       'Study time',
       'Age',
       'Sex',
       'Study description',
       'kVp',
       'mAs',
       'mA',
       'Exposure time (ms)',
       'Exposure index',
       'Target exposure index',
       'Deviation index',
       'DAP (cGy.cm^2)'],
      dtype='object')
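
To try the import step without the original file, the same call can be pointed at an in-memory CSV. The following is a minimal sketch: the two data rows are made-up illustrative values (they are not taken from testDataLarge.csv), only a few of the columns are included, and the parse_dates argument is an optional extra that the code above does not use.

```python
import io

import pandas as pd

# Hypothetical sample data: a few of the column headings listed above,
# with made-up values purely for illustration.
csv_text = """Model name,Study description,Study date,DAP (cGy.cm^2)
Example mobile A,Chest,2021-01-05,1.2
Example fixed B,Abdomen,2021-01-06,25.0
"""

# read_csv accepts any file-like object, so StringIO stands in for the file.
# parse_dates asks Pandas to convert the named column to datetime values.
sample_df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Study date"])

# Column headings as a plain Python list
print(sample_df.columns.tolist())

# Data type that Pandas inferred for each column
print(sample_df.dtypes)
```

Checking dtypes straight after import is a quick way to spot a numeric column that has been read as text.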
          

Create a boxplot of DAP for each study description for each x-ray system

The contents of the DataFrame can now be used as a data source to create Plotly charts.

            # Create a Plotly figure
            fig = px.box(
                df,                     # The source of data for the chart
                x="Study description",  # The DataFrame column name to use for x-axis categories
                y="DAP (cGy.cm^2)",     # The DataFrame column name to use for y-axis values
                color="Model name",     # The DataFrame column name to use for each series
            )

            # Write the figure to a standalone html file (plotly.js is loaded from a CDN)
            fig.write_html("dapBox.html", include_plotlyjs="cdn")
          

The include_plotlyjs="cdn" argument makes the saved figure retrieve the plotly.min.js file from the internet rather than embedding it in the file, so the chart needs an internet connection to display. This reduces the file size by around 3 MB. See the Plotly documentation here.

The resulting interactive chart

Filtering the data

The data can be filtered to include only certain models or study descriptions, as per this Stack Overflow post.

This chart also has a custom title (title="xxx"), and I have specified a Plotly style template (template="presentation"); see the Plotly documentation here.

            # Just data from mobiles (str.contains is case-sensitive by default)
            df_just_mobiles = df[df["Model name"].str.contains("mobile")]

            # Just mobile chests
            fig = px.box(
                df_just_mobiles[df_just_mobiles["Study description"].str.contains("chest")],
                x="Study description",
                y="DAP (cGy.cm^2)",
                color="Model name",
                title="Mobile chest x-ray DAP",  # Provide a chart title
                template="presentation",         # Specify a Plotly style template to use
            )
            fig.write_html("dapMobileChestBox.html", include_plotlyjs="cdn")
          

The resulting interactive chart
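
The two filtering steps can also be combined into a single boolean mask. The sketch below uses made-up rows and adds two options that the code above does not use: case=False, so that "Mobile" and "mobile" both match, and na=False, so that missing values are treated as non-matches rather than causing an error.

```python
import pandas as pd

# Made-up rows purely for illustration
demo_df = pd.DataFrame(
    {
        "Model name": ["Example Mobile 1", "Example mobile 2", "Example fixed", None],
        "Study description": ["Chest AP", "Abdomen", "Chest PA", "chest"],
        "DAP (cGy.cm^2)": [1.1, 18.0, 0.9, 1.3],
    }
)

# One boolean Series per condition; True where the text contains the pattern
is_mobile = demo_df["Model name"].str.contains("mobile", case=False, na=False)
is_chest = demo_df["Study description"].str.contains("chest", case=False, na=False)

# Combine both conditions with & and select the matching rows in one step
df_mobile_chests = demo_df[is_mobile & is_chest]

print(df_mobile_chests)
```

The filtered DataFrame can then be passed to px.box as the data source in the same way as before.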