PhysiCell Tools : PhysiCell-povwriter

As PhysiCell matures, we are starting to turn our attention to better training materials and an ecosystem of open source PhysiCell tools. PhysiCell-povwriter is designed to help transform your 3-D simulation results into 3-D visualizations like this one:

PhysiCell-povwriter transforms simulation snapshots into 3-D scenes that can be rendered into still images using POV-ray: an open source software package that uses raytracing to mimic the path of light from a source of illumination to a single viewpoint (a camera or an eye). The result is a beautifully rendered scene (at any resolution you choose) with very nice shading and lighting.

If you repeat this on many simulation snapshots, you can create an animation of your work.

What you’ll need

This workflow is entirely based on open source software:

• 3-D simulation data (likely stored in ./output from your project)
• PhysiCell-povwriter, available on GitHub at
• POV-ray, available at
• ImageMagick (optional, for image file conversions)
• mencoder (optional, for making compressed movies)

Setup

Building PhysiCell-povwriter

After you clone PhysiCell-povwriter or download its source from a release, you’ll need to compile it. In the project’s root directory, compile the project by:

make


(If you need to set up a C++ PhysiCell development environment, click here for OSX or here for Windows.)

Next, copy povwriter (povwriter.exe in Windows) to either the root directory of your PhysiCell project, or somewhere in your path. Copy ./config/povwriter-settings.xml to the ./config directory of your PhysiCell project.

Editing resolutions in POV-ray

PhysiCell-povwriter is intended for creating “square” images, but POV-ray does not have any pre-created square rendering resolutions out-of-the-box. However, this is straightforward to fix.

1. Open POV-Ray
2. Go to the “tools” menu and select “edit resolution INI file”
3. At the top of the INI file (which opens for editing in POV-ray), make a new profile:
[1080x1080, AA]
Width=1080
Height=1080
Antialias=On


4. Make similar profiles (with unique names) to suit your preferences. I suggest one at 480×480 (as a fast preview), another at 2160×2160, and another at 5000×5000 (because they will be absurdly high resolution). For example:
[2160x2160 no AA]
Width=2160
Height=2160
Antialias=Off


You can optionally make more profiles with antialiasing on (which provides some smoothing for areas of high detail), but you’re probably better off just rendering without antialiasing at higher resolutions and then scaling the image down as needed. Also, rendering without antialiasing will be faster.

5. Once done making profiles, save and exit POV-Ray.
6. The next time you open POV-Ray, your new resolution profiles will be available in the lefthand dropdown box.

Configuring PhysiCell-povwriter

Once you have copied povwriter-settings.xml to your project’s config directory, open it in a text editor. Below, we’ll show the different settings.

Camera settings

<camera>
<distance_from_origin units="micron">1500</distance_from_origin>
<xy_angle>3.92699081699</xy_angle> <!-- 5*pi/4 -->
<yz_angle>1.0471975512</yz_angle> <!-- pi/3 -->
</camera>


For simplicity, PhysiCell-povwriter (currently) always aims the camera towards the origin (0,0,0), with “up” towards the positive z-axis. distance_from_origin sets how far the camera is placed from the origin. xy_angle sets the angle $$\theta$$ from the positive x-axis in the xy-plane. yz_angle sets the angle $$\phi$$ from the positive z-axis in the yz-plane. Both angles are in radians.
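To get a feel for what these three numbers mean geometrically, here is a minimal Python sketch that turns them into a Cartesian camera position. It assumes the usual spherical-coordinate convention ($$\theta$$ measured in the xy-plane from the positive x-axis, $$\phi$$ measured down from the positive z-axis). The function name camera_position is made up for this illustration, and povwriter's own source may differ in detail.

```python
import math

def camera_position(distance, xy_angle, yz_angle):
    """Return (x, y, z) for a camera that looks at the origin.

    Assumption: standard spherical coordinates. This is an illustration
    of the settings, not a copy of povwriter's internal code.
    """
    x = distance * math.cos(xy_angle) * math.sin(yz_angle)
    y = distance * math.sin(xy_angle) * math.sin(yz_angle)
    z = distance * math.cos(yz_angle)
    return (x, y, z)

# Default settings from the XML above: 1500 microns, 5*pi/4, pi/3
x, y, z = camera_position(1500.0, 5 * math.pi / 4, math.pi / 3)
print(f"camera at ({x:.1f}, {y:.1f}, {z:.1f}) microns")
```

Under this assumption, the default settings put the camera in the octant with x < 0, y < 0, and z > 0, looking back at the origin.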

Options

<options>
<use_standard_colors>true</use_standard_colors>
<nuclear_offset units="micron">0.1</nuclear_offset>
<cell_bound units="micron">750</cell_bound>
<threads>8</threads>
</options>


use_standard_colors (if set to true) uses a built-in “paint-by-numbers” color scheme, where each cell type (identified with an integer) gets XML-defined colors for live, apoptotic, and necrotic cells. More on this below. If use_standard_colors is set to false, then PhysiCell-povwriter uses the my_pigment_and_finish_function in ./custom_modules/povwriter.cpp to color cells.

The nuclear_offset is a small additional height given to nuclei when cropping, to avoid visual artifacts when rendering (which can cause some “tearing” or “bleeding” between the rendered nucleus and cytoplasm). cell_bound is used to leave out cells that lie outside a bounding box: any cell with |x|, |y|, or |z| exceeding cell_bound will not be rendered. threads is used for parallelizing on multicore processors; note that it only speeds up povwriter if you are converting multiple PhysiCell outputs to povray files.
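The cell_bound rule is simple enough to write down in a few lines. Here is a minimal Python sketch of the test as described above; the function name within_cell_bound is hypothetical and is not part of povwriter.

```python
def within_cell_bound(position, cell_bound):
    """Return True if a cell centered at position=(x, y, z) is kept.

    A cell is dropped as soon as |x|, |y|, or |z| exceeds cell_bound,
    so it is kept only if all three coordinates are within the bound.
    """
    return all(abs(coordinate) <= cell_bound for coordinate in position)

# With the default cell_bound of 750 microns:
print(within_cell_bound((100.0, -200.0, 50.0), 750.0))   # kept
print(within_cell_bound((800.0, 0.0, 0.0), 750.0))       # dropped
```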

Save

<save> <!-- done -->
<folder>output</folder> <!-- use . for root -->
<filebase>output</filebase>
<time_index>3696</time_index>
</save>


Use folder to tell PhysiCell-povwriter where the data files are stored. Use filebase to tell how the outputs are named. Typically, they have the form output########_cells_physicell.mat; in this case, the filebase is output. Lastly, use time_index to set the output number. For example if your file is output00000182_cells_physicell.mat, then filebase = output and time_index = 182.
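If it helps to see the naming rule spelled out, here is a small Python sketch that builds the data file name from the three save settings. It assumes the 8-digit, zero-padded index shown in the examples above; snapshot_filename is a made-up helper name, not something povwriter provides.

```python
def snapshot_filename(folder, filebase, time_index):
    """Build the cell data file name from the <save> settings.

    Assumption: the index is zero-padded to 8 digits, as in
    output00000182_cells_physicell.mat.
    """
    return f"{folder}/{filebase}{time_index:08d}_cells_physicell.mat"

print(snapshot_filename("output", "output", 182))
print(snapshot_filename("output", "output", 3696))
```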

Below, we’ll see how to specify ranges of indices at the command line, which would supersede the time_index given here in the XML.

Clipping planes

PhysiCell-povwriter uses clipping planes to help create cutaway views of the simulations. By default, 3 clipping planes are used to cut out an octant of the viewing area.

Recall that a plane can be defined by its normal vector and a point p on the plane. With these, the plane can be defined as all points satisfying

$\left( \vec{x} -\vec{p} \right) \cdot \vec{n} = 0$

These are then written out as a plane equation

$a x + by + cz + d = 0,$

where

$(a,b,c) = \vec{n} \hspace{.5in} \textrm{ and } \hspace{0.5in} d = -\vec{n} \cdot \vec{p}.$

As of Version 1.0.0, we are having some difficulties with clipping planes that do not pass through the origin (0,0,0), for which $$d \ne 0$$.

In the config file, these planes are written as $$(a,b,c,d)$$:

<clipping_planes> <!-- done -->
<clipping_plane>0,-1,0,0</clipping_plane>
<clipping_plane>-1,0,0,0</clipping_plane>
<clipping_plane>0,0,1,0</clipping_plane>
</clipping_planes>


Note that cells “behind” the plane (where $$( \vec{x} - \vec{p} ) \cdot \vec{n} \le 0$$) are rendered, and cells in “front” of the plane (where $$(\vec{x}-\vec{p}) \cdot \vec{n} > 0$$) are not rendered. Cells that intersect the plane are partially rendered (using constructive geometry via union and intersection commands in POV-ray).
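Here is a minimal Python sketch of the plane algebra above: it converts a normal vector and a point into the $$(a,b,c,d)$$ form used in the config file, and tests which side of a single plane a cell center falls on. The function names are hypothetical, and the sketch only looks at cell centers for one plane at a time. It does not try to reproduce how povwriter combines several planes or partially renders cells that intersect a plane.

```python
def plane_from_normal_and_point(n, p):
    """Return (a, b, c, d) with (a, b, c) = n and d = -n . p."""
    a, b, c = n
    d = -(n[0] * p[0] + n[1] * p[1] + n[2] * p[2])
    return (a, b, c, d)

def is_behind(plane, x):
    """True if point x satisfies a*x + b*y + c*z + d <= 0 ("behind")."""
    a, b, c, d = plane
    return a * x[0] + b * x[1] + c * x[2] + d <= 0

# First default plane: normal (0,-1,0) through the origin
plane = plane_from_normal_and_point((0, -1, 0), (0, 0, 0))
print(plane)                                # the (a,b,c,d) tuple
print(is_behind(plane, (10.0, 5.0, 0.0)))   # y > 0: behind this plane
print(is_behind(plane, (10.0, -5.0, 0.0)))  # y < 0: in front of it
```

For a plane that does not pass through the origin, d is nonzero: a normal of (0,0,1) through the point (0,0,100) gives d = -100.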

Cell color definitions

Within <cell_color_definitions>, you’ll find multiple <cell_colors> blocks, each of which defines the live, apoptotic, and necrotic colors for a specific cell type (with the type ID indicated in the type attribute). These colors are only applied if use_standard_colors is set to true in options. See above.

The live colors are given as two rgb (red,green,blue) colors for the cytoplasm and nucleus of live cells. Each element of this triple can range from 0 to 1, and not from 0 to 255 as in many raw image formats. Next, finish specifies ambient (how much highly-scattered background ambient light illuminates the cell), diffuse (how well light rays can illuminate the surface), and specular (how much of a shiny reflective splotch the cell gets).

This is repeated to give the apoptotic and necrotic colors for the cell type.

<cell_colors type="0">
<live>
<cytoplasm>.25,1,.25</cytoplasm> <!-- red,green,blue -->
<nuclear>0.03,0.125,0.03</nuclear>
<finish>0.05,1,0.1</finish> <!-- ambient,diffuse,specular -->
</live>
<apoptotic>
<cytoplasm>1,0,0</cytoplasm> <!-- red,green,blue -->
<nuclear>0.125,0,0</nuclear>
<finish>0.05,1,0.1</finish> <!-- ambient,diffuse,specular -->
</apoptotic>
<necrotic>
<cytoplasm>1,0.5412,0.1490</cytoplasm> <!-- red,green,blue -->
<nuclear>0.125,0.06765,0.018625</nuclear>
<finish>0.01,0.5,0.1</finish> <!-- ambient,diffuse,specular -->
</necrotic>
</cell_colors>


Use multiple cell_colors blocks (each with type corresponding to the integer cell type) to define the colors of multiple cell types.
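If you are used to picking colors on a 0 to 255 scale, here is a small Python sketch for converting them to the 0 to 1 triples used in the XML. The helper names are made up for illustration. As a check, the 0 to 255 color (255, 138, 38) reproduces the necrotic cytoplasm color shown above.

```python
def rgb255_to_unit(r, g, b, digits=4):
    """Convert a 0-255 color to a 0-1 triple, rounded to `digits` places."""
    for value in (r, g, b):
        if not 0 <= value <= 255:
            raise ValueError("each color component must be in [0, 255]")
    return tuple(round(value / 255.0, digits) for value in (r, g, b))

def as_xml_text(rgb):
    """Format a triple as the comma-separated text used in the XML."""
    return ",".join(f"{value:g}" for value in rgb)

necrotic_cytoplasm = rgb255_to_unit(255, 138, 38)
print(as_xml_text(necrotic_cytoplasm))
```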

Using PhysiCell-povwriter

Use by the XML configuration file alone

The simplest syntax:

physicell$ ./povwriter

(Windows users: povwriter or povwriter.exe) will process ./config/povwriter-settings.xml and convert the single indicated PhysiCell snapshot to a .pov file.

If you run povwriter with the default configuration file in the povwriter structure (with the supplied sample data), it will render time index 3696 from the immunotherapy example in our 2018 PhysiCell Method Paper:

physicell$ ./povwriter

povwriter version 1.0.0
================================================================================

Copyright (c) Paul Macklin 2019, on behalf of the PhysiCell project

Usage:
================================================================================
povwriter : run povwriter with config file ./config/settings.xml

povwriter FILENAME.xml : run povwriter with config file FILENAME.xml

povwriter x:y:z : run povwriter on data in FOLDER with indices from x
to y in incremenets of z

Example: ./povwriter 0:2:10 processes files:
./FOLDER/FILEBASE00000000_physicell_cells.mat
./FOLDER/FILEBASE00000002_physicell_cells.mat
...
./FOLDER/FILEBASE00000010_physicell_cells.mat
(See the config file to set FOLDER and FILEBASE)

povwriter x1,...,xn : run povwriter on data in FOLDER with indices x1,...,xn

Example: ./povwriter 1,3,17 processes files:
./FOLDER/FILEBASE00000001_physicell_cells.mat
./FOLDER/FILEBASE00000003_physicell_cells.mat
./FOLDER/FILEBASE00000017_physicell_cells.mat
(Note that there are no spaces.)
(See the config file to set FOLDER and FILEBASE)

Tutorial & documentation at http://MathCancer.org/blog/povwriter
================================================================================

Using config file ./config/povwriter-settings.xml ...
Using standard coloring function ...
Found 3 clipping planes ...
Found 2 cell color definitions ...
Processing file ./output/output00003696_cells_physicell.mat...
Matrix size: 32 x 66978
Creating file pov00003696.pov for output ...
Writing 66978 cells ...
done!

Done processing all 1 files!


The result is a single POV-ray file (pov00003696.pov) in the root directory.

Now, open that file in POV-ray (double-click the file if you are in Windows), choose one of your resolutions in your lefthand dropdown (I’ll choose 2160×2160 no antialiasing), and click the green “run” button.

You can watch the image as it renders. The result should be a PNG file (named pov00003696.png) that looks like this:

Using command-line options to process multiple times (option #1)

Now, suppose we have more outputs to process. We still state most of the options in the XML file as above, but now we also supply a command-line argument in the form of start:interval:end. If you’re still in the povwriter project, note that we have some more sample data there. Let’s grab and process it:

physicell$ cd output
physicell$ unzip more_samples.zip
Archive: more_samples.zip
inflating: output00000000_cells_physicell.mat
inflating: output00000001_cells_physicell.mat
inflating: output00000250_cells_physicell.mat
inflating: output00000300_cells_physicell.mat
inflating: output00000500_cells_physicell.mat
inflating: output00000750_cells_physicell.mat
inflating: output00001000_cells_physicell.mat
inflating: output00001250_cells_physicell.mat
inflating: output00001500_cells_physicell.mat
inflating: output00001750_cells_physicell.mat
inflating: output00002000_cells_physicell.mat
inflating: output00002250_cells_physicell.mat
inflating: output00002500_cells_physicell.mat
inflating: output00002750_cells_physicell.mat
inflating: output00003000_cells_physicell.mat
inflating: output00003250_cells_physicell.mat
inflating: output00003500_cells_physicell.mat
inflating: output00003696_cells_physicell.mat

physicell$ ls
citation and license.txt more_samples.zip output00000000_cells_physicell.mat output00000001_cells_physicell.mat output00000250_cells_physicell.mat output00000300_cells_physicell.mat output00000500_cells_physicell.mat output00000750_cells_physicell.mat output00001000_cells_physicell.mat output00001250_cells_physicell.mat output00001500_cells_physicell.mat output00001750_cells_physicell.mat output00002000_cells_physicell.mat output00002250_cells_physicell.mat output00002500_cells_physicell.mat output00002750_cells_physicell.mat output00003000_cells_physicell.mat output00003250_cells_physicell.mat output00003500_cells_physicell.mat output00003696.xml output00003696_cells_physicell.mat

Let’s go back to the parent directory and run povwriter:

physicell$ ./povwriter 0:250:3500

povwriter version 1.0.0
================================================================================

Copyright (c) Paul Macklin 2019, on behalf of the PhysiCell project

Usage:
================================================================================
povwriter : run povwriter with config file ./config/settings.xml

povwriter FILENAME.xml : run povwriter with config file FILENAME.xml

povwriter x:y:z : run povwriter on data in FOLDER with indices from x
to y in incremenets of z

Example: ./povwriter 0:2:10 processes files:
./FOLDER/FILEBASE00000000_physicell_cells.mat
./FOLDER/FILEBASE00000002_physicell_cells.mat
...
./FOLDER/FILEBASE00000010_physicell_cells.mat
(See the config file to set FOLDER and FILEBASE)

povwriter x1,...,xn : run povwriter on data in FOLDER with indices x1,...,xn

Example: ./povwriter 1,3,17 processes files:
./FOLDER/FILEBASE00000001_physicell_cells.mat
./FOLDER/FILEBASE00000003_physicell_cells.mat
./FOLDER/FILEBASE00000017_physicell_cells.mat
(Note that there are no spaces.)
(See the config file to set FOLDER and FILEBASE)

Tutorial & documentation at http://MathCancer.org/blog/povwriter
================================================================================

Using config file ./config/povwriter-settings.xml ...
Using standard coloring function ...
Found 3 clipping planes ...
Found 2 cell color definitions ...
Matrix size: 32 x 18317
Processing file ./output/output00000000_cells_physicell.mat...
Creating file pov00000000.pov for output ...
Writing 18317 cells ...
Processing file ./output/output00002000_cells_physicell.mat...
Matrix size: 32 x 33551
Creating file pov00002000.pov for output ...
Writing 33551 cells ...
Processing file ./output/output00002500_cells_physicell.mat...
Matrix size: 32 x 43440
Creating file pov00002500.pov for output ...
Writing 43440 cells ...
Processing file ./output/output00001500_cells_physicell.mat...
Matrix size: 32 x 40267
Creating file pov00001500.pov for output ...
Writing 40267 cells ...
Processing file ./output/output00003000_cells_physicell.mat...
Matrix size: 32 x 56659
Creating file pov00003000.pov for output ...
Writing 56659 cells ...
Processing file ./output/output00001000_cells_physicell.mat...
Matrix size: 32 x 74057
Creating file pov00001000.pov for output ...
Writing 74057 cells ...
Processing file ./output/output00003500_cells_physicell.mat...
Matrix size: 32 x 66791
Creating file pov00003500.pov for output ...
Writing 66791 cells ...
Processing file ./output/output00000500_cells_physicell.mat...
Matrix size: 32 x 114316
Creating file pov00000500.pov for output ...
Writing 114316 cells ...
done!

Processing file ./output/output00000250_cells_physicell.mat...
Matrix size: 32 x 75352
Creating file pov00000250.pov for output ...
Writing 75352 cells ...
done!

Processing file ./output/output00002250_cells_physicell.mat...
Matrix size: 32 x 37959
Creating file pov00002250.pov for output ...
Writing 37959 cells ...
done!

Processing file ./output/output00001750_cells_physicell.mat...
Matrix size: 32 x 32358
Creating file pov00001750.pov for output ...
Writing 32358 cells ...
done!

Processing file ./output/output00002750_cells_physicell.mat...
Matrix size: 32 x 49658
Creating file pov00002750.pov for output ...
Writing 49658 cells ...
done!

Processing file ./output/output00003250_cells_physicell.mat...
Matrix size: 32 x 63546
Creating file pov00003250.pov for output ...
Writing 63546 cells ...
done!

done!

done!

done!

Processing file ./output/output00001250_cells_physicell.mat...
Matrix size: 32 x 54771
Creating file pov00001250.pov for output ...
Writing 54771 cells ...
done!

done!

done!

done!

Processing file ./output/output00000750_cells_physicell.mat...
Matrix size: 32 x 97642
Creating file pov00000750.pov for output ...
Writing 97642 cells ...
done!

done!

Done processing all 15 files!


Notice that the output appears a bit out of order. This is normal: povwriter is using 8 threads to process 8 files at the same time, and all of them write to the same screen. Since this is all happening simultaneously, the output is a bit jumbled (and non-sequential). Don’t panic. You should now have created pov00000000.pov, pov00000250.pov, … , pov00003500.pov.

Now, go into POV-ray, and choose “queue.” Click “Add File” and select all 15 .pov files you just created:

Hit “OK” to let it render all the povray files to create PNG files (pov00000000.png, … , pov00003500.png).

Using command-line options to process multiple times (option #2)

You can also give a list of indices (separated by commas, with no spaces). Here’s how we render time indices 250, 1000, and 2250:

physicell$ ./povwriter 250,1000,2250
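To make the two command-line forms concrete, here is a minimal Python sketch that expands either one into the list of time indices it stands for. It mirrors the start:interval:end and x1,...,xn examples in the usage text (with the end index included); it is not povwriter's own parser, and expand_indices is a hypothetical name.

```python
def expand_indices(argument):
    """Expand 'start:interval:end' or 'x1,...,xn' into a list of ints."""
    if ":" in argument:
        start, interval, end = (int(part) for part in argument.split(":"))
        if interval <= 0:
            raise ValueError("interval must be a positive integer")
        # The end index is included, as in the 0:2:10 usage example.
        return list(range(start, end + 1, interval))
    return [int(part) for part in argument.split(",")]

print(expand_indices("0:250:3500"))
print(expand_indices("250,1000,2250"))
```

Expanding 0:250:3500 this way gives 15 indices, which agrees with the 15 files processed in option #1.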



Creating an animated GIF with ImageMagick

Suppose you want to create an animated GIF based on your images. I suggest first converting to JPG (see above) and then using ImageMagick again. Here, I’m adding a delay of 20 between frames. ImageMagick measures -delay in hundredths of a second by default, so this is 200 ms per frame:

physicell$ magick convert -delay 20 *.jpg out.gif


Here’s the result:

Creating a compressed movie with Mencoder

Syntax coming later.

Closing thoughts and future work

In the future, we will probably allow more control over the clipping planes and a bit more debugging on how to handle planes that don’t pass through the origin. (First thoughts: we need to change how we use union and intersection commands in the POV-ray outputs.)

We should also look at adding some transparency for the cells. I’d prefer something like rgba (red-green-blue-alpha), but POV-ray uses filters and transmission, and we want to make sure to get it right.

Lastly, it would be nice to find a balance between the current very simple camera setup and better control.

Thanks for reading this PhysiCell Friday tutorial! Please do give PhysiCell a try (at http://PhysiCell.org) and read the method paper at PLoS Computational Biology.


Setting up the PhysiCell microenvironment with XML

As of release 1.6.0, users can define all the chemical substrates in the microenvironment with an XML configuration file. (These are stored by default in ./config/. The default parameter file is ./config/PhysiCell_settings.xml.) This should make it much easier to set up the microenvironment (which previously required a lot of manual C++), as well as make it easier to change parameters and initial conditions.

This tutorial will show you the key techniques to use these features. (See the User_Guide for full documentation.) First, let’s create a barebones 2D project by populating the 2D template project. In a terminal shell in your root PhysiCell directory, do this:

make template2D


We will use this 2D project template for the remainder of the tutorial. We assume you already have a working copy of PhysiCell installed, version 1.6.0 or later. (If not, visit the PhysiCell tutorials to find installation instructions for your operating system.)

Microenvironment setup in the XML configuration file

Next, let’s look at the parameter file. In your text editor of choice, open up ./config/PhysiCell_settings.xml, and browse down to <microenvironment_setup>:

<microenvironment_setup>
<variable name="oxygen" units="mmHg" ID="0">
<physical_parameter_set>
<diffusion_coefficient units="micron^2/min">100000.0</diffusion_coefficient>
<decay_rate units="1/min">0.1</decay_rate>
</physical_parameter_set>
<initial_condition units="mmHg">38.0</initial_condition>
<Dirichlet_boundary_condition units="mmHg" enabled="true">38.0</Dirichlet_boundary_condition>
</variable>

<options>
<track_internalized_substrates_in_each_agent>false</track_internalized_substrates_in_each_agent>
<!-- not yet supported -->
<initial_condition type="matlab" enabled="false">
<filename>./config/initial.mat</filename>
</initial_condition>
<!-- not yet supported -->
<dirichlet_nodes type="matlab" enabled="false">
<filename>./config/dirichlet.mat</filename>
</dirichlet_nodes>
</options>
</microenvironment_setup>


Notice a few trends:

• The <variable> XML element (tag) is used to define a chemical substrate in the microenvironment. The attributes say that it is named oxygen, and the units of measurement are mmHg. Notice also that the ID is 0: this unique integer identifier helps for finding and accessing the substrate within your PhysiCell project. Make sure your first substrate ID is 0, since C++ starts indexing at 0.
• Within the <variable> block, we set the properties of this substrate:
    • <diffusion_coefficient> sets the (uniform) diffusion constant for the substrate.
    • <decay_rate> is the substrate’s background decay rate.
    • <initial_condition> is the value the substrate will be (uniformly) initialized to throughout the domain.
    • <Dirichlet_boundary_condition> is the value the substrate will be set to along the outer computational boundary throughout the simulation, if you set enabled=true. If enabled=false, then PhysiCell (via BioFVM) will use Neumann (zero flux) conditions for that substrate.
• The <options> element helps configure other simulation behaviors:
    • Use <calculate_gradients> to control whether PhysiCell computes all chemical gradients at each time step. Set this to true to enable accurate gradients (e.g., for chemotaxis).
    • Use <track_internalized_substrates_in_each_agent> to enable or disable the PhysiCell feature of actively tracking the total amount of internalized substrates in each individual agent. Set this to true to enable the feature.
    • <initial_condition> is reserved for a future release where we can specify non-uniform initial conditions as an external file (e.g., a CSV or Matlab file). This is not yet supported.
    • <dirichlet_nodes> is reserved for a future release where we can specify Dirichlet nodes at any location in the simulation domain with an external file. This will be useful for irregular domains, but it is not yet implemented.

Note that PhysiCell does not convert units. The units attributes are helpful for clarity between users and developers, to ensure that you have worked in consistent length and time units. By default, PhysiCell uses minutes for all time units, and microns for all spatial units.
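If you want to double-check what a settings file defines before running a simulation, here is a minimal Python sketch that lists the substrates in a <microenvironment_setup> block using the standard library's ElementTree. The helper name list_substrates is hypothetical, and the sketch parses the block on its own. In a full PhysiCell_settings.xml you would first locate the <microenvironment_setup> element inside the larger file.

```python
import xml.etree.ElementTree as ET

# The oxygen definition from the block above (options trimmed for brevity).
SAMPLE = """<microenvironment_setup>
  <variable name="oxygen" units="mmHg" ID="0">
    <physical_parameter_set>
      <diffusion_coefficient units="micron^2/min">100000.0</diffusion_coefficient>
      <decay_rate units="1/min">0.1</decay_rate>
    </physical_parameter_set>
    <initial_condition units="mmHg">38.0</initial_condition>
    <Dirichlet_boundary_condition units="mmHg" enabled="true">38.0</Dirichlet_boundary_condition>
  </variable>
</microenvironment_setup>"""

def list_substrates(xml_text):
    """Return one dictionary of settings per <variable> element."""
    root = ET.fromstring(xml_text)
    substrates = []
    for variable in root.findall("variable"):
        parameters = variable.find("physical_parameter_set")
        boundary = variable.find("Dirichlet_boundary_condition")
        substrates.append({
            "name": variable.get("name"),
            "ID": int(variable.get("ID")),
            "diffusion_coefficient": float(parameters.find("diffusion_coefficient").text),
            "decay_rate": float(parameters.find("decay_rate").text),
            "initial_condition": float(variable.find("initial_condition").text),
            "dirichlet_enabled": boundary.get("enabled") == "true",
            "dirichlet_value": float(boundary.text),
        })
    return substrates

for substrate in list_substrates(SAMPLE):
    print(substrate)
```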

Changing an existing substrate

Let’s modify the oxygen variable to do the following:

• Change the diffusion coefficient to 120000 $$\mu\mathrm{m}^2 / \mathrm{min}$$
• Change the initial condition to 40 mmHg
• Change the oxygen Dirichlet boundary condition to 42.7 mmHg

If you modify the appropriate fields in the <microenvironment_setup> block, it should look like this:

<microenvironment_setup>
<variable name="oxygen" units="mmHg" ID="0">
<physical_parameter_set>
<diffusion_coefficient units="micron^2/min">120000.0</diffusion_coefficient>
<decay_rate units="1/min">0.1</decay_rate>
</physical_parameter_set>
<initial_condition units="mmHg">40.0</initial_condition>
<Dirichlet_boundary_condition units="mmHg" enabled="true">42.7</Dirichlet_boundary_condition>
</variable>

<options>
<track_internalized_substrates_in_each_agent>false</track_internalized_substrates_in_each_agent>
<!-- not yet supported -->
<initial_condition type="matlab" enabled="false">
<filename>./config/initial.mat</filename>
</initial_condition>
<!-- not yet supported -->
<dirichlet_nodes type="matlab" enabled="false">
<filename>./config/dirichlet.mat</filename>
</dirichlet_nodes>
</options>
</microenvironment_setup>


Adding a new substrate

Let’s add a new dimensionless substrate glucose with the following:

• Diffusion coefficient is 18000 $$\mu\mathrm{m}^2 / \mathrm{min}$$
• No decay rate
• The initial condition is 1 (dimensionless)
• Neumann (no flux) boundary conditions

To add the new variable, I suggest copying an existing variable (in this case, oxygen) and modifying to:

• change the name and units throughout
• increase the ID by one
• write in the appropriate initial and boundary conditions

If you modify the appropriate fields in the <microenvironment_setup> block, it should look like this:

<microenvironment_setup>
<variable name="oxygen" units="mmHg" ID="0">
<physical_parameter_set>
<diffusion_coefficient units="micron^2/min">120000.0</diffusion_coefficient>
<decay_rate units="1/min">0.1</decay_rate>
</physical_parameter_set>
<initial_condition units="mmHg">40.0</initial_condition>
<Dirichlet_boundary_condition units="mmHg" enabled="true">42.7</Dirichlet_boundary_condition>
</variable>

<variable name="glucose" units="dimensionless" ID="1">
<physical_parameter_set>
<diffusion_coefficient units="micron^2/min">18000.0</diffusion_coefficient>
<decay_rate units="1/min">0.0</decay_rate>
</physical_parameter_set>
<initial_condition units="dimensionless">1</initial_condition>
<Dirichlet_boundary_condition units="dimensionless" enabled="false">0</Dirichlet_boundary_condition>
</variable>

<options>
<track_internalized_substrates_in_each_agent>false</track_internalized_substrates_in_each_agent>
<!-- not yet supported -->
<initial_condition type="matlab" enabled="false">
<filename>./config/initial.mat</filename>
</initial_condition>
<!-- not yet supported -->
<dirichlet_nodes type="matlab" enabled="false">
<filename>./config/dirichlet.mat</filename>
</dirichlet_nodes>
</options>
</microenvironment_setup>
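Copying and editing variables by hand makes it easy to forget to change a name or an ID. Here is a minimal Python sketch of a sanity check: it verifies that the substrate IDs run 0, 1, 2, ... in file order and that the names are unique. The function check_substrate_ids is a hypothetical helper, not a PhysiCell feature.

```python
import xml.etree.ElementTree as ET

def check_substrate_ids(xml_text):
    """Return a list of problems found in the <variable> definitions."""
    root = ET.fromstring(xml_text)
    variables = list(root.iter("variable"))
    ids = [int(variable.get("ID")) for variable in variables]
    names = [variable.get("name") for variable in variables]
    problems = []
    if ids != list(range(len(ids))):
        problems.append(f"IDs should be 0..{len(ids) - 1} in order, found {ids}")
    if len(set(names)) != len(names):
        problems.append(f"substrate names are not unique: {names}")
    return problems

# A stripped-down version of the two-substrate setup above.
example = """<microenvironment_setup>
  <variable name="oxygen" units="mmHg" ID="0"/>
  <variable name="glucose" units="dimensionless" ID="1"/>
</microenvironment_setup>"""

print(check_substrate_ids(example) or "substrate definitions look consistent")
```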


Closing thoughts and future work

In the future, we plan to develop more of the options to allow users to set the initial conditions externally and import them (via an external file), and to allow them to set up more complex domains by importing Dirichlet nodes.

More broadly, we are working to push more model specification from raw C++ to imported XML. It is our hope that this will vastly simplify model development, facilitate creation of graphical model editing tools, and ultimately broaden the class of developers who can use and contribute to PhysiCell. Thanks for giving it a try!


A small computational thought experiment

In Macklin (2017), I briefly touched on a simple computational thought experiment that shows that for a group of homogeneous cells, you can observe substantial heterogeneity in cell behavior. This “thought experiment” is part of a broader preview and discussion of a fantastic paper by Linus Schumacher, Ruth Baker, and Philip Maini published in Cell Systems, where they showed that a migrating collective of homogeneous cells can show heterogeneous behavior when quantitated with new migration metrics. I highly encourage you to check out their work!

In this blog post, we work through my simple thought experiment in a little more detail.

Note: If you want to reference this blog post, please cite the Cell Systems preview article:

P. Macklin, When seeing isn’t believing: How math can guide our interpretation of measurements and experiments. Cell Sys., 2017 (in press). DOI: 10.1016/j.cels.2017.08.005

The thought experiment

Consider a simple (and widespread) model of a population of cycling cells: each virtual cell (with index i) has a single “oncogene” $$r_i$$ that sets the rate of progression through the cycle. Between now (t) and a small time from now ( $$t+\Delta t$$), the virtual cell has a probability $$r_i \Delta t$$ of dividing into two daughter cells. At the population scale, the overall population growth model that emerges from this simple single-cell model is:
$\frac{dN}{dt} = \langle r\rangle N,$
where $$\langle r \rangle$$ is the mean division rate over the cell population, and $$N$$ is the number of cells. See the discussion in the supplementary information for Macklin et al. (2012).

Now, suppose (as our thought experiment) that we could track individual cells in the population and track how long it takes them to divide. (We’ll call this the division time.) What would the distribution of cell division times look like, and how would it vary with the distribution of the single-cell rates $$r_i$$?

Mathematical method

In the Matlab script below, we implement this cell cycle model as just about every discrete model does. Here’s the pseudocode:

t = 0;
while( t < t_max )
    for i=1:Cells.size()
        u = random_number();
        if( u < Cells[i].birth_rate * dt )
            Cells[i].division_time = Cells[i].age;
            Cells[i].divide();
        end
    end
    t = t+dt;
end


That is, until we’ve reached the final simulation time, loop through all the cells and decide if they should divide: For each cell, choose a random number between 0 and 1, and if it’s smaller than the cell’s division probability ($$r_i \Delta t$$), then divide the cell and write down the division time.

As an important note, we have to track the same cells until they all divide, rather than merely record which cells have divided up to the end of the simulation. Otherwise, we end up with an observational bias that throws off our recording. See more below.
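For readers who would like to try the idea without Matlab, here is a minimal, self-contained Python sketch of the same loop. It is not the Matlab script from the download: the function name, the time step of 0.1 hr, and the fixed random seed are all assumptions made for this illustration. It follows a fixed set of initial cells until every one of them has divided, and it does not track daughter cells.

```python
import random

def simulate_division_times(rates, dt=0.1, seed=0):
    """Follow each initial cell until it divides; return division times (hr).

    rates: per-cell division rates r_i in 1/hr. Each must be positive,
    and r_i * dt should be well below 1 to act as a probability.
    """
    if any(r <= 0 for r in rates):
        raise ValueError("all division rates must be positive")
    rng = random.Random(seed)              # fixed seed for repeatability
    division_time = [None] * len(rates)
    undivided = list(range(len(rates)))
    t = 0.0
    while undivided:                       # keep going until all have divided
        t += dt
        still_waiting = []
        for i in undivided:
            # divide with probability r_i * dt during this time step
            if rng.random() < rates[i] * dt:
                division_time[i] = t       # cells start at age 0, so age == t
            else:
                still_waiting.append(i)
        undivided = still_waiting
    return division_time

# Homogeneous population: 10,000 cells with r = 0.05 per hour
times = simulate_division_times([0.05] * 10000)
mean_time = sum(times) / len(times)
print(f"mean division time: {mean_time:.2f} hr (theory: {1 / 0.05:.0f} hr)")
```

Passing a list of different rates (one per cell) lets you repeat the experiment for a heterogeneous population.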

The sample code

http://MathCancer.org/files/matlab/thought_experiment_matlab(Macklin_Cell_Systems_2017).zip

Extract all the files, and run “thought_experiment” in Matlab (or Octave, if you don’t have a Matlab license or prefer an open source platform) for the main result.

All these Matlab files are available as open source, under the GPL license (version 3 or later).

Results and discussion

First, let’s see what happens if all the cells are identical, with $$r = 0.05 \textrm{ hr}^{-1}$$. We run the script, and track the time for each of 10,000 cells to divide. As expected by theory (Macklin et al., 2012) (but perhaps still a surprise if you haven’t looked), we get an exponential distribution of division times, with mean time $$1/\langle r \rangle$$:

So even in this simple model, a homogeneous population of cells can show heterogeneity in their behavior. Here’s the interesting thing: let’s now give each cell its own division parameter $$r_i$$ from a normal distribution with mean $$0.05 \textrm{ hr}^{-1}$$ and a relative standard deviation of 25%:

If we repeat the experiment, we get the same distribution of cell division times!

So in this case, based solely on observations of the phenotypic heterogeneity (the division times), it is impossible to distinguish a “genetically” homogeneous cell population (one with identical parameters) from a truly heterogeneous population. We would require other metrics, like tracking changes in the mean division time as cells with a higher $$r_i$$ out-compete the cells with lower $$r_i$$.

Lastly, I want to point out that caution is required when designing these metrics and single-cell tracking. If instead we had tracked all cells throughout the simulated experiment, including new daughter cells, and then recorded the first 10,000 cell division events, we would get a very different distribution of cell division times:

By only recording the division times for the cells that have divided, and not those that haven’t, we bias our observations towards cells with shorter division times. Indeed, the mean division time for this simulated experiment is far lower than we would expect by theory. You can try this one by running “bad_thought_experiment”.

This post is an expansion of our recent preview in Cell Systems in Macklin (2017):

P. Macklin, When seeing isn’t believing: How math can guide our interpretation of measurements and experiments. Cell Sys., 2017 (in press). DOI: 10.1016/j.cels.2017.08.005

And the original work on apparent heterogeneity in collective cell migration is by Schumacher et al. (2017):

L. Schumacher et al., Semblance of Heterogeneity in Collective Cell Migration. Cell Sys., 2017 (in press). DOI: 10.1016/j.cels.2017.06.006

You can read some more on relating exponential distributions and Poisson processes to common discrete mathematical models of cell populations in Macklin et al. (2012):

P. Macklin, et al., Patient-calibrated agent-based modelling of ductal carcinoma in situ (DCIS): From microscopic measurements to macroscopic predictions of clinical progression. J. Theor. Biol. 301:122-40, 2012. DOI: 10.1016/j.jtbi.2012.02.002.

Lastly, I’d be delighted if you took a look at the open source software we have been developing for 3-D simulations of multicellular systems biology:

http://OpenSource.MathCancer.org

And you can always keep up-to-date by following us on Twitter: @MathCancer.

MathCancer C++ Style and Practices Guide

As PhysiCell, BioFVM, and other open source projects start to gain new users and contributors, it’s time to lay out a coding style. We have three goals here:

1. Consistency: It’s easier to understand and contribute to the code if it’s written in a consistent way.
2. Readability: We want the code to be as readable as possible.
3. Reducing errors: We want to avoid coding styles that are more prone to errors (e.g., code that can be broken by introducing whitespace).

So, here is the guide (revised June 2017). I expect to revise this guide from time to time.

Place braces on separate lines in functions and classes.

I find it much easier to read a class if the braces are on separate lines, with good use of whitespace. Remember: whitespace costs almost nothing, but reading and understanding (and time!) are expensive.

DON’T

class Cell{
public:
double some_variable;
bool some_extra_variable;

Cell(); };

class Phenotype{
public:
double some_variable;
bool some_extra_variable;

Phenotype();
};


DO:

class Cell
{
public:
double some_variable;
bool some_extra_variable;

Cell();
};

class Phenotype
{
public:
double some_variable;
bool some_extra_variable;

Phenotype();
};


Enclose all logic in braces, even when optional.

In C/C++, you can omit the curly braces in some cases. For example, this is legal:

if( distance > 1.5*cell_radius )
     interaction = false;
     force = 0.0; // is this part of the logic, or a separate statement?
error = false;


However, this code is ambiguous to interpret. Moreover, small changes to whitespace, or small additions to the logic, could mess things up here. Use braces to make the logic crystal clear:

DON’T

if( distance > 1.5*cell_radius )
     interaction = false;
     force = 0.0; // is this part of the logic, or a separate statement?
error = false;

if( condition1 == true )
     do_something1 = true;
else if( condition2 == true )
     do_something2 = true;
else
     do_something3 = true;


DO

if( distance > 1.5*cell_radius )
{
     interaction = false;
     force = 0.0;
}
error = false;

if( condition1 == true )
{ do_something1 = true; }
else if( condition2 == true )
{ do_something2 = true; }
else
{ do_something3 = true; }
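To see why this matters, here is a small sketch that can be compiled and run. The struct and function names are hypothetical. The unbraced version silently zeroes the force for every pair of cells, because only the first statement after the if belongs to it.

```cpp
#include <cassert>

// Hypothetical container for the outcome of an interaction test.
struct Interaction_Result
{
     bool interaction;
     double force;
};

// Without braces, only the first statement is governed by the if.
Interaction_Result check_without_braces( double distance , double cell_radius )
{
     Interaction_Result result = { true , 1.0 };
     if( distance > 1.5*cell_radius )
          result.interaction = false;
     result.force = 0.0; // outside the if, however it is indented
     return result;
}

// With braces, both statements run only for distant cells.
Interaction_Result check_with_braces( double distance , double cell_radius )
{
     Interaction_Result result = { true , 1.0 };
     if( distance > 1.5*cell_radius )
     {
          result.interaction = false;
          result.force = 0.0;
     }
     return result;
}
```

For two nearby cells (say, a distance of 1.0 and a radius of 1.0), the braced version keeps the force at 1.0, while the unbraced version returns a force of 0.0 even though the interaction flag is still true.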


Put braces on separate lines in logic, except for single-line logic.

This style rule relates to the previous point, to improve readability.

DON’T

if( distance > 1.5*cell_radius ){
     interaction = false;
     force = 0.0; }

if( condition1 == true ){ do_something1 = true; }
else if( condition2 == true ){
     do_something2 = true; }
else
{ do_something3 = true; error = true; }


DO

if( distance > 1.5*cell_radius )
{
     interaction = false;
     force = 0.0;
}

if( condition1 == true )
{ do_something1 = true; } // this is fine
else if( condition2 == true )
{
     do_something2 = true; // this is better
}
else
{
     do_something3 = true;
     error = true;
}


See how much easier that code is to read? The logical structure is crystal clear, and adding more to the logic is simple.

End all functions with a return, even if void.

For clarity, definitively state that a function is done by using return.

DON’T

void my_function( Cell& cell )
{
cell.phenotype.volume.total *= 2.0;
cell.phenotype.death.rates[0] = 0.02;
// Are we done, or did we forget something?
// is somebody still working here?
}


DO

void my_function( Cell& cell )
{
cell.phenotype.volume.total *= 2.0;
cell.phenotype.death.rates[0] = 0.02;
return;
}


Use tabs to indent the contents of a class or function.

This is to make the code easier to read. (Unfortunately PHP/HTML makes me use five spaces here instead of tabs.)

DON’T

class Secretion
{
public:
std::vector<double> secretion_rates;
std::vector<double> uptake_rates;
std::vector<double> saturation_densities;
};

void my_function( Cell& cell )
{
cell.phenotype.volume.total *= 2.0;
cell.phenotype.death.rates[0] = 0.02;
return;
}


DO

class Secretion
{
 public:
     std::vector<double> secretion_rates;
     std::vector<double> uptake_rates;
     std::vector<double> saturation_densities;
};

void my_function( Cell& cell )
{
     cell.phenotype.volume.total *= 2.0;
     cell.phenotype.death.rates[0] = 0.02;
     return;
}


Use a single space to indent public and other keywords in a class.

This gets us some nice formatting in classes, without needing two tabs everywhere.

DON’T

class Secretion
{
public:
     std::vector<double> secretion_rates;
     std::vector<double> uptake_rates;
     std::vector<double> saturation_densities;
}; // not enough whitespace

class Errors
{
          private:
          public:
               std::string error_message;
               int error_code;
}; // too much whitespace!


DO

class Secretion
{
 private:
 public:
     std::vector<double> secretion_rates;
     std::vector<double> uptake_rates;
     std::vector<double> saturation_densities;
};

class Errors
{
 private:
 public:
     std::string error_message;
     int error_code;
};


Avoid arcane operators, when clear logic statements will do.

It can be difficult to decipher code with statements like this:

phenotype.volume.fluid=phenotype.volume.fluid<0?0:phenotype.volume.fluid;


Moreover, C and C++ do not parse ternary expressions identically (for example, when an assignment follows the colon), so subtle bugs can creep in when using the “fancier” compact operators. Variations in how these operators work across languages are an additional source of error for programmers switching between languages in their daily scientific workflows. Wherever possible (and unless there is a significant performance reason to do otherwise), use clear logical structures that are easy to read even if you only dabble in C/C++. Compile-time optimizations will most likely eliminate any performance gains from these goofy operators.

DON’T

// if the fluid volume is negative, set it to zero
phenotype.volume.fluid=phenotype.volume.fluid<0.0?0.0:phenotype.volume.fluid;


DO

if( phenotype.volume.fluid < 0.0 )
{
phenotype.volume.fluid = 0.0;
}


Here’s the funny thing: the second logic is much clearer, and it took fewer characters, even with extra whitespace for readability!
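As a concrete illustration of how precedence can bite, consider this sketch (the function names are made up for the example). The author of the compact version presumably wanted a base count of 10 plus one extra for a dividing cell, but addition binds more tightly than the ternary operator, so the expression means something else.

```cpp
#include <cassert>

// Intended meaning: ( is_dividing ? 1 : 0 ) + 10. Because + binds more
// tightly than ?:, this parses as is_dividing ? 1 : ( 0 + 10 ).
int compact_count( bool is_dividing )
{
     return is_dividing ? 1 : 0 + 10;
}

// The same intent written as plain logic, with no precedence to remember.
int clear_count( bool is_dividing )
{
     int count = 10;
     if( is_dividing )
     {
          count += 1;
     }
     return count;
}
```

For a dividing cell, the compact version yields 1 while the clear version yields 11. Both agree (10) when the cell is not dividing, which is exactly the kind of bug that slips past a quick test.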

Pass by reference where possible.

Passing by reference is a great way to boost performance: we can avoid (1) allocating new temporary memory, (2) copying data into the temporary memory, (3) passing the temporary data to the function, and (4) deallocating the temporary memory once finished.

DON’T

double some_function( Cell cell )
{
return cell.phenotype.volume.total + 3.0;
}
// This copies cell and all its member data!


DO

double some_function( Cell& cell )
{
return cell.phenotype.volume.total + 3.0;
}
// This just accesses the original cell data without recopying it.
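If you want to see the copy happen, here is a small sketch with a hypothetical stand-in class (Tracked_Cell is not part of PhysiCell) that counts how often its copy constructor runs. The static counter is only there for the demonstration and is not thread safe.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a large agent class. It counts how many times
// it has been copied so that the cost of pass-by-value is visible.
class Tracked_Cell
{
 public:
     static int copy_count; // demonstration only; not thread safe
     std::vector<double> data;

     Tracked_Cell() : data( 1000 , 1.0 )
     { }
     Tracked_Cell( const Tracked_Cell& other ) : data( other.data )
     { copy_count++; }
};

int Tracked_Cell::copy_count = 0;

// Pass by value: the argument is copied, including all 1000 doubles.
double total_by_value( Tracked_Cell cell )
{
     return cell.data[0] + 3.0;
}

// Pass by reference: the function works on the caller's object.
double total_by_reference( Tracked_Cell& cell )
{
     return cell.data[0] + 3.0;
}
```

Calling total_by_reference on an existing Tracked_Cell leaves copy_count unchanged, while each call to total_by_value on that same object increases it by one.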


Where possible, pass by reference instead of by pointer.

There is no performance advantage to passing by pointer over passing by reference, but the code is simpler and clearer to write and understand when you can pass by reference. (If nothing else, you save yourself a character of typing each time you can replace “->” by “.”!)

DON’T

double some_function( Cell* pCell )
{
return pCell->phenotype.volume.total + 3.0;
}
// Writing and debugging this code can be error-prone.


DO

double some_function( Cell& cell )
{
return cell.phenotype.volume.total + 3.0;
}
// This is much easier to write.


Be careful with static variables. Be thread safe!

PhysiCell relies heavily on parallelization by OpenMP, and so you should write functions under the assumption that they may be executed many times simultaneously. One easy source of errors is in static variables:

DON’T

double some_function( Cell& cell )
{
static double four_pi = 12.566370614359172;
static double output;
output = cell.phenotype.geometry.radius;
output *= output;
output *= four_pi;
return output;
}
// If two instances of some_function are running, they will both modify
// the *same copy* of output


DO

double some_function( Cell& cell )
{
static double four_pi = 12.566370614359172;
double output = cell.phenotype.geometry.radius; // initialize before use
output *= output;
output *= four_pi;
return output;
}
// If two instances of some_function are running, each will modify
// its own copy of output, but still use the more efficient, once-
// allocated copy of four_pi. This one is safe for OpenMP.
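Here is a minimal sketch of how such a function might be called from an OpenMP loop. The function names and the plain radius argument are assumptions for illustration (they are not PhysiCell functions). Each call keeps its working value in a local variable, so the loop iterations do not interfere with one another.

```cpp
#include <cassert>
#include <vector>

// Thread safe: the only shared data is read-only, and all working storage
// is local to each call.
double sphere_surface_area( double radius )
{
     static const double four_pi = 12.566370614359172;
     double output = radius;
     output *= output;
     output *= four_pi;
     return output;
}

// Fill areas[i] for each radius. If the compiler is run without OpenMP
// support, the pragma is ignored and the loop runs serially.
std::vector<double> all_surface_areas( const std::vector<double>& radii )
{
     std::vector<double> areas( radii.size() , 0.0 );
     #pragma omp parallel for
     for( int i = 0 ; i < (int) radii.size() ; i++ )
     {
          areas[i] = sphere_surface_area( radii[i] );
     }
     return areas;
}
```

Each iteration writes only to its own element of areas, so no two threads touch the same memory. Making output static in sphere_surface_area would reintroduce the race described above.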


Use std:: instead of “using namespace std”

PhysiCell uses the BioFVM and PhysiCell namespaces to avoid potential collision with other codes. Other codes using PhysiCell may use functions that collide with the standard namespace. So, we formally use std:: whenever using functions in the standard namespace.

DON’T

using namespace std;

cout << "Hi, Mom, I learned to code today!" << endl;
string my_string = "Cheetos are good, but Doritos are better.";
cout << my_string << endl;

vector<double> my_vector;
my_vector.resize( 3, 0.0 );


DO

std::cout << "Hi, Mom, I learned to code today!" << std::endl;
std::string my_string = "Cheetos are good, but Doritos are better.";
std::cout << my_string << std::endl;

std::vector<double> my_vector;
my_vector.resize( 3, 0.0 );
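As a small illustration of the collision problem, suppose a project defines its own function that shares a name with one in the standard library. The namespace My_Model and its max function are hypothetical. With explicit qualification, both versions coexist and it is always clear which one is being called.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical project namespace that happens to reuse a standard name.
namespace My_Model
{
     // Returns the larger value, but never less than zero (unlike std::max).
     double max( double a , double b )
     {
          double larger = std::max( a , b );
          if( larger < 0.0 )
          { larger = 0.0; }
          return larger;
     }
}
```

Here My_Model::max( -2.0 , -1.0 ) and std::max( -2.0 , -1.0 ) give different answers by design. Pulling both namespaces into scope with using directives would make an unqualified max( -2.0 , -1.0 ) ambiguous.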


Camelcase is ugly. Use underscores.

This is purely an aesthetic distinction, but CamelCaseCodeIsUglyAndDoYouUseDNAorDna?

DON’T

double MyVariable1;
bool ProteinsInExosomes;
int RNAtranscriptionCount;

void MyFunctionDoesSomething( Cell& ImmuneCell );


DO

double my_variable1;
bool proteins_in_exosomes;
int RNA_transcription_count;

void my_function_does_something( Cell& immune_cell );


Use capital letters to declare a class. Use lowercase for instances.

To help in readability and consistency, declare classes with capital letters (but no camelcase), and use lowercase for instances of those classes.

DON’T

class phenotype;

class cell
{
public:
std::vector<double> position;
phenotype Phenotype;
};

class ImmuneCell : public cell
{
public:
std::vector<double> surface_receptors;
};

void do_something( cell& MyCell , ImmuneCell& immuneCell );

cell Cell;
ImmuneCell MyImmune_cell;

do_something( Cell, MyImmune_cell );


DO

class Phenotype
{
}; // defined (not just declared) so that Cell can hold one by value

class Cell
{
public:
std::vector<double> position;
Phenotype phenotype;
};

class Immune_Cell : public Cell
{
public:
std::vector<double> surface_receptors;
};

void do_something( Cell& my_cell , Immune_Cell& immune_cell );

Cell cell;
Immune_Cell my_immune_cell;

do_something( cell, my_immune_cell );


DCIS modeling paper accepted

Recently, I wrote about a major work we submitted to the Journal of Theoretical Biology: “Patient-calibrated agent-based modelling of ductal carcinoma in situ (DCIS): From microscopic measurements to macroscopic predictions of clinical progression.”

I am pleased to report that our paper has now been accepted. You can download the accepted preprint here. We also have a lot of supplementary material, including simulation movies, simulation datasets (for 0, 15, 30, and 45 days of growth), and open source C++ code for postprocessing and visualization.

I discussed the results in detail here, but here’s the short version:

1. We use a mechanistic, agent-based model of individual cancer cells growing in a duct. Cells are moved by adhesive and repulsive forces exchanged with other cells and the basement membrane. Cell phenotype is controlled by stochastic processes.
2. We constrained all parameters expected to be relatively independent of patients by a careful analysis of the experimental biological and clinical literature.
3. We developed the very first patient-specific calibration method, using clinically-accessible pathology. This is a key point in future patient-tailored predictions and surgical/therapeutic planning.
4. The model made numerous quantitative predictions, such as:
   1. The tumor grows at a constant rate, between 7 and 10 mm/year. This is right in the middle of the range reported in the clinic.
   2. The tumor’s size in mammography is linearly correlated with the post-surgical pathology size. When we linearly extrapolate our correlation across two orders of magnitude, it goes right through the middle of a cluster of 87 clinical data points.
   3. The tumor necrotic core has an age structuring, with the oldest, calcified material in the center, and the newest, most intact necrotic cells at the outer edge.
   4. The appearance of a “typical” DCIS duct cross-section varies with distance from the leading edge; all types of cross-sections predicted by our model are observed in patient pathology.
5. The model also gave new insight on the underlying biology of breast cancer, such as:
   1. The split between the viable rim and necrotic core (observed almost universally in pathology) is not just an artifact, but an actual biomechanical effect from fast necrotic cell lysis.
   2. The constant rate of tumor growth arises from the biomechanical stress relief provided by lysing necrotic cells. This points to the critical role of intracellular and intra-tumoral water transport in determining the qualitative and quantitative behavior of tumors.
   3. Pyknosis (nuclear degradation in necrotic cells) must occur at a time scale between that of cell lysis (on the order of hours) and cell calcification (on the order of weeks).
   4. The current model cannot explain the full spectrum of calcification types; other biophysics, such as degradation over a long, 1-2 month time scale, must be at play.

I hope you enjoy this article and find it useful. It is our hope that it will help drive our field from qualitative theory towards quantitative, patient-tailored predictions.

Direct link to the preprint: http://www.mathcancer.org/Publications.php#macklin12_jtb

I want to express my greatest thanks to my co-authors, colleagues, and the editorial staff at the Journal of Theoretical Biology.

Now hiring: Postdoctoral Researcher

I just posted a job opportunity for a postdoctoral researcher for computational modeling of breast, prostate, and metastatic cancer, with a heavy emphasis on calibrating (and validating!) to in vitro, in vivo, and clinical data.

If you’re a talented computational modeler and have a passion for applying mathematics to make a difference in clinical care, please read the job posting and apply!

(Note: Interested students in the Los Angeles/Orange County area may want to attend my applied math seminar talk at UCI next week to learn more about this work.)