ParaView for PhysiCell – Part 1

In this tutorial, we discuss the topic of visualizing data that is generated by PhysiCell. Specifically, we discuss the visualization of cells. In a later post, we’ll discuss options for visualizing the microenvironment. For 2-D models, PhysiCell generates Scalable Vector Graphics (SVG) files that depict cells’ positions, sizes (volumes), and colors (virtual pathology). Obviously, visualizing cells from 3-D models is more challenging. (Note: SVG files are also generated for 3-D models; however, they capture only the cells that intersect the Z=0 plane). Until now, we have only discussed a couple of applications for visualizing 3-D data from PhysiCell: MATLAB (or Octave) and POV-Ray. In this post, we describe ParaView: an open source, data analysis and visualization application from Kitware. ParaView can be used to visualize a wide variety of scientific data, as depicted on their gallery page.

Preliminary steps

Before we get started, just a reminder – if you have any problems or questions, please join our mailing list (physicell-users@googlegroups.com) to get help. In preparation for using the customized PhysiCell data reader in ParaView, you will need to have a specific Python module, scipy, installed. Python will be a useful language for other PhysiCell data analysis and visualization tasks too, so having it installed on your computer will come in handy later, beyond using ParaView. Installing and using Python (and the scipy module) for ParaView can be confusing for several reasons:

  • you may or may not have Administrator permission on your computer,
  • there are different versions of Python, including the major version confusion – 2 vs. 3,
  • there are different distributions of Python, and
  • ParaView comes with its own built-in Python (version 2), but it isn’t easily extensible.

Before we get operating system specific, we just want to point out that it is possible to have multiple versions and/or distributions of Python on your computer. Unfortunately, there is no guarantee that you can mix modules of one with another. This is especially true when mixing one popular Python distribution, Anaconda, with any other distribution.

We now provide some detailed instructions for the primary operating systems. In the sections that follow, we assume you have Admin permission on your computer. If you do not and you need to install Python + scipy as a standard user, see Appendix A at the end.

Windows

Windows does not come with Python, by default. Therefore, you will need to download/install it from python.org – get the latest 2.x version – currently https://www.python.org/downloads/release/python-2714/.

During the installation process, you will be asked if you want to install for all users or just yourself. That is up to you.

You’ll have the option of changing the default installation directory. We recommend keeping the default: C:\Python27.

Finally, you will have the choice of having this python executable be your default. That is up to you. If you have another Python distribution installed that you want to keep as your default, then you should leave this choice unchecked, i.e. leave the “X”. But if you do want to use this one as your default, select “Add python.exe to Path”:

After completing the Python 2.7 installation, open a Command Prompt shell and run (or if you selected to use this python.exe as your default in the above step, you can just use ‘python’ instead of specifying the full path):

c:\python27\python.exe -m pip install scipy


This will download and install the scipy module which is what our PhysiCell data reader uses. You can verify that it got installed by running python and trying to import it:

c:\>python27\python.exe
Python 2.7.14 (v2.7.14:84471935ed, Sep 16 2017, 20:25:58) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import scipy
>>> quit()

OSX and Linux

Both OSX and Linux come with a system-level version of Python pre-installed (/usr/bin/python). Regardless of whether you have installed additional versions of Python, you will want to make sure the pre-installed version has the scipy module. OSX should already have it. However, Linux will not and you will need to install it via:

/usr/bin/python -m pip install scipy

You can test if scipy is accessible by simply trying to import it:

/usr/bin/python
...
>>> import scipy
>>> quit()
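
With several Python installations on one machine, it helps to confirm which interpreter you are actually running and whether it can see scipy. The following is a minimal sketch using only the standard library; the helper name module_available is ours (hypothetical), not part of PhysiCell or ParaView. It should behave the same under Python 2.7 and Python 3.

```python
import importlib
import sys


def module_available(name):
    """Return True if the named module can be imported by this interpreter."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # Show which Python is running, then report on scipy.
    print("interpreter: %s" % sys.executable)
    print("version: %d.%d" % (sys.version_info[0], sys.version_info[1]))
    print("scipy importable: %s" % module_available("scipy"))
```

Save it to a file and run it with the specific interpreter you want to check, for example /usr/bin/python or c:\python27\python.exe.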

Installing ParaView

After you have successfully installed Python + scipy, download and install the (binary) ParaView application for your particular platform. The current, stable version is 5.4.1.

Windows

Assuming you have Admin permission, download/install ParaView-5.4.1-Qt5-OpenGL2-Windows-64bit.exe. If you do not have Admin permission, see Appendix A.

OSX or Linux

On OSX, assuming you have Admin permissions, download/install ParaView-5.4.1-Qt5-OpenGL2-MPI-OSX10.8-64bit.dmg. If you do not have Admin permission, see Appendix A.

On Linux, download ParaView-5.4.1-Qt5-OpenGL2-MPI-Linux-64bit.tar.gz and uncompress it into an appropriate directory. For example, if you want to make ParaView available to everyone, you may want to uncompress it into /usr/local. If you only want to make it available for your personal use, see Appendix A.

Running ParaView

Before starting the ParaView application, you can set an environment variable, PHYSICELL_DATA, to be the full path to the directory where the PhysiCell data can be found. This will make it easier for the custom data reader (a Python script) in the ParaView pipeline to find the data. For example, in the next section we provide a link to some sample PhysiCell data. If you simply uncompress that zip archive into your Downloads directory, then (on Linux/OSX bash) you could:

$ export PHYSICELL_DATA=/path-to-your-home-dir/Downloads

(this Windows tutorial, while aimed at editing your PATH, will also help you find the environment variables setting).

If you choose not to set the PHYSICELL_DATA environment variable, the custom data reader will look in your user Downloads directory. Alternatively, you can simply edit the ParaView custom data reader Python script to point to your data directory (and then you would probably want to File -> Save State).

NOTE: if you are using OSX and want to use the PHYSICELL_DATA environment variable, you should start the ParaView application from the Terminal, e.g. $ /Applications/ParaView-5.4.1.app/Contents/MacOS/paraview &
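
The lookup order described above (PHYSICELL_DATA first, then your Downloads directory) can be sketched in a few lines of Python. This is only an illustration of the idea, not the code in our reader script; the function name resolve_data_dir is an assumption made for this example.

```python
import os


def resolve_data_dir(environ=None):
    """Pick the data directory: PHYSICELL_DATA if set, else ~/Downloads."""
    if environ is None:
        environ = os.environ
    configured = environ.get("PHYSICELL_DATA")
    if configured:
        return configured
    # Fallback assumed here: the user's Downloads directory.
    return os.path.join(os.path.expanduser("~"), "Downloads")


if __name__ == "__main__":
    print(resolve_data_dir())
```

Running this from the same shell you use to launch ParaView is a quick way to check that the environment variable is visible to child processes.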

Finally, go ahead and start the ParaView application. You should see a blank GUI (Figure 1). Don’t be frightened by the complexity of the GUI – yes, there are several widgets, but we will walk you down a minimal path to help you visualize PhysiCell cell data. Of course you have access to all of ParaView’s documentation – the Getting Started, Guide, and Tutorial under the Help menu are a good place to start. In addition to the downloadable (.pdf) documentation, there is also in-depth information online: https://www.paraview.org/Wiki/ParaView


Figure 1. The ParaView application

Note: the 3-D axis in the lower-left corner of the RenderView does not represent the origin. It is just a reference axis to provide 3-D orientation.

Before you get started with examples, you should open ParaView’s Settings (Preferences), select the General tab, and make sure “Auto Apply” is checked. This will avoid the need to manually “Apply” changes you make to an object’s properties.

Cancer immunity 3D example

We will use sample output data from the cancer_immune_3D project – one of the sample projects bundled with PhysiCell. You can refer to the PhysiCell Quickstart guide if you want to actually compile and run that project. But to simplify this tutorial, we provide sample data (a single time step) from that project, at:
https://sourceforge.net/projects/physicell/files/Tutorials/MultiCellDS/3D_PhysiCell_matlab_sample.zip/download

After you extract the files from this .zip, the file of interest for this tutorial is called output00003696_cells_physicell.mat. (And remember, as discussed above, to set the PHYSICELL_DATA environment variable to point to the directory where you extracted the files).

Additionally, for this tutorial, you’ll need some predefined ParaView state files (*.pvsm). A state file is just what it sounds like – it saves the entire state of a ParaView session so that you can easily re-use it. For this tutorial, we have provided the following state files to help you get started:

  • physicell_glyphs.pvsm – render cells as simple vertex glyphs
  • physicell_z0slice.pvsm – render the intersection of spherical glyphs with the Z=0 plane
  • physicell_3clip_planes_ospray.pvsm – OSPRay renderer with 3 clipping planes
  • physicell_cells_animate.pvsm – demonstrate how to do animation

Download them here: physicell_paraview_states.zip

Figure 2 is the SVG image from the sample data. It depicts the (3-D) cells that intersect the Z=0 plane.


Figure 2. snapshot00003696.svg generated by PhysiCell

Reading PhysiCell (cell) data

One way that ParaView offers extensibility is via Python scripts. To make it easier to read data generated by PhysiCell, we provide users with a “Programmable Source” that will read and process data in a file. In the ParaView screen-captured figures below, the Programmable Source (“PhysiCellReader_cells”) will be the very first module in the pipeline.

For this exercise, you can File->LoadState and select the physicell_glyphs.pvsm file that you downloaded above. Assuming you previously copied the output00003696_cells_physicell.mat sample data file into one of the default directories as described above, it should display the results in Figure 3. (Otherwise, you will likely see an error message appear, in which case see Appendix B). These are the simplest (and fastest) glyphs to represent PhysiCell cells. They are known as 2-D Vertex glyphs, although the 2-D is misleading since they are rendered in 3-space. At this point, you can interact with the visualization by rotating, zooming, and panning (cf. Sect. 4.4.2 in the ParaViewGuide-5.4.0 for an explanation of the controls, but basically: hold the left mouse button while moving the cursor to rotate; Ctrl-left-mouse to zoom; Ctrl-Shift-left-mouse to pan).

For this particular ParaView state file, we use the following code to assign colors to cells (Note: the PhysiCell code in /custom_modules creates the SVG file using colors based on cells’ neighbor information. This information is not saved in the .mat output files, therefore we cannot faithfully reproduce SVG cell colors here):

   # The following lines assign an integer to represent
   # a color, defined in a Color Map.
   sval = 0   # immune cells are yellow?
   if val[5,idx] == 1:  # [5]=cell_type
     sval = 1   # lime green
   if (val[6,idx] == 6) or (val[6,idx] == 7):
     sval = 0
   if val[7,idx] == 100:  # [7]=current_phase
     sval = 3   # apoptotic: red
   if val[7,idx] > 100 and val[7,idx] < 104:
     sval = 2   # necrotic: brownish
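
If you want to experiment with the coloring rule outside of ParaView, the excerpt above can be wrapped in a small stand-alone function. This is a sketch that mirrors the if-chain as written; the function name color_index and the argument name row6 are ours, and row6 stands for row 6 of the data, which the excerpt does not label.

```python
def color_index(cell_type, row6, current_phase):
    """Mirror the reader's if-chain; later tests override earlier ones."""
    sval = 0                      # default color index
    if cell_type == 1:            # row 5 of the data: cell_type
        sval = 1                  # lime green
    if row6 in (6, 7):            # row 6: not labeled in the excerpt
        sval = 0
    if current_phase == 100:      # row 7 of the data: current_phase
        sval = 3                  # apoptotic: red
    if 100 < current_phase < 104:
        sval = 2                  # necrotic: brownish
    return sval


if __name__ == "__main__":
    print(color_index(1, 0, 14))   # a type 1 cell in a non-death phase
    print(color_index(0, 0, 100))  # an apoptotic cell
```

Because the tests are applied in order, a cell's death phase always takes precedence over its type.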


Figure 3. Glyphs as vertices


Figure 4. Glyphs as spheres

The next exercise is a simple extension of the previous. We want to add a slice to our pipeline that will intersect the cells (spherical glyphs). Assuming the slice is the Z=0 plane, this will approximate the SVG in Figure 2. So, select Edit->ResetSession to start from scratch, then File->LoadState and select the physicell_z0slice.pvsm file. Note that the “Glyph1” node in the pipeline is invisible (its “eyeball” is toggled off). If you make it visible (select the eyeball), you will essentially reproduce Figure 4. But for this exercise, we want Glyph1 to be invisible and Slice1 visible. If you select Slice1 (its green box) in the pipeline, you can select its Properties tab (at the top) to see all its properties. Of particular interest is the “Show Plane” checkbox at the top.

If this is checked on, you can interactively translate the plane along its normal, as well as select and rotate the normal itself. Try it! You can also “hardcode” the slice plane parameters in its properties panel.


Figure 5. Z=0 slice through the spherical glyphs (approximate SVG slice)

Our final exercise with the sample dataset is to visualize cells using a higher quality renderer in ParaView known as OSPRay. So, as before, Edit->ResetSession to start from scratch, then File->LoadState and select the physicell_3clip_planes_ospray.pvsm file. This will create a pipeline that has 3 clipping planes aligned such that an octant of our spheroidal cell cluster is clipped away, letting us peer into its core (Figure 6). As with the previous slice plane, you can interactively reposition one or more of the clipping planes here.

Note: One (temporary) downside of using the OSPRay renderer is that cells (spheres) cannot be arbitrarily scaled in size. This will be fixed in a future release of ParaView. For now, we get around that by shrinking the distance of each cell from the origin. Obviously this is not a perfect solution, but for cells that are sort of clustered around the origin, it offers a decent approximation. We mention this 1) to disclose the inaccuracy, and 2) in case you look at the data ranges in the “Information” view/tab and notice they are not reflecting the domain of your simulation.
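
To make the workaround concrete, here is a minimal sketch of shrinking each cell's distance from the origin by a uniform factor. The function name and the factor of 0.5 are purely illustrative assumptions; the value actually used in the state file is not stated here.

```python
def shrink_toward_origin(positions, factor):
    """Scale every (x, y, z) position by the same factor about the origin."""
    return [(factor * x, factor * y, factor * z) for (x, y, z) in positions]


if __name__ == "__main__":
    cells = [(100.0, -50.0, 0.0), (20.0, 20.0, 20.0)]
    print(shrink_toward_origin(cells, 0.5))
```

A uniform scaling keeps directions and relative distances intact, which is why the picture still looks right for a cluster centered on the origin, while the reported coordinate ranges no longer match the simulation domain.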


Figure 6. OSPRay-rendered cells, showing 1 of 3 interactive clipping planes

Animation in Time

At some point, you’ll want to see the dynamics of your simulation. This is where ParaView’s animation functionality comes in handy. It provides an easy way to let you render all (or a subset) of your PhysiCell output files, control the animation, and also save images (as .png files).

For this exercise, you will obviously want to have generated multiple output files from a PhysiCell project. Once you have those, e.g. from the cancer_immune_3D project, you can File->LoadState and select the physicell_cells_animate.pvsm file. This will show the following pipeline:

If you select the PhysiCellReader_cells (green box; leave its “eye” icon deselected) and click on the Properties tab, you will have access to the Script(RequestInformation):

In there, you will see the lines:

timeSteps = range(1,20,1)   # (start, stop+1, step)
#timeSteps = range(100,801,50)   # (start, stop+1 [, step])

You’ll want to edit the first line, specifying your desired start, stop, and (optionally) step values for your files’ numeric suffixes. We happened to use the second timeSteps line (currently commented out with the ‘#’) for data from our simulation, which resulted in 15 time steps (0-14) for the time values: 100,150,200,…800. Pressing the Play (center) button of the animation icons would animate those frames:
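
As a quick check on the range arithmetic, the following sketch lists which frame index corresponds to which numeric suffix. The helper name time_step_table is hypothetical; note that it takes the stop value inclusively and adds the 1 itself, matching the (start, stop+1, step) comment in the script.

```python
def time_step_table(start, stop, step=1):
    """Return (frame_index, suffix_value) pairs for range(start, stop + 1, step)."""
    return list(enumerate(range(start, stop + 1, step)))


if __name__ == "__main__":
    for frame, value in time_step_table(100, 800, 50):
        print("frame %2d -> suffix value %d" % (frame, value))
```

For (100, 800, 50) this yields 15 pairs, with frame 0 mapped to 100 and frame 14 mapped to 800.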

When you like what you see, you can select File->SaveAnimation to save the images. After you provide some minimal information for those images, e.g., resolution and filename prefix, press “OK” and you will see a horizontal progress bar being updated below the render window. After the animation images are generated, you can use your favorite image processing and movie generation tools to post-process them. In Figure 7, we used ImageMagick’s montage command to arrange the 15 images from a cancer_immune_3D simulation. And for generating a movie file, take a look at mplayer/mencoder as one open source option.


Figure 7. Images from cancer_immune_3D

Further Help

For questions specific to PhysiCell, have a look at our User Guide and join our mailing list to ask questions. For ParaView, have a look at their User Guide and join the ParaView mailing list to ask questions.

Thanks!

In closing, we would like to thank Kitware for their terrific open source software and the ParaView community (especially David DeMarle and Utkarsh Ayachit) for being so helpful answering questions and providing insight. We were honored that some of our early results got headlined on Kitware’s home page!

Appendix A: Installing without Admin permission

If you do not have Admin permission on your computer, we provide instructions for installing Python + scipy and ParaView using alternative approaches.

Windows

To install Python on Windows, without Admin permission, run the msiexec command on the .msi Python installer that you downloaded, specifying a (non-system) directory in which it should be installed via the targetdir keyword. After that completes, Python will be installed; however, the python command will not be in your PATH, i.e., it will not be globally accessible.

C:\Users\sue\Downloads>msiexec /a python-2.7.14.amd64.msi /qb targetdir=c:\py27

C:\Users\sue\Downloads>python
'python' is not recognized as an internal or external command,
operable program or batch file.

C:\Users\sue\Downloads>cd c:\py27

c:\py27>.\python.exe
Python 2.7.14 (v2.7.14:84471935ed, Sep 16 2017, 20:25:58) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>

Finally, download ParaView-5.4.1-Qt5-OpenGL2-Windows-64bit.zip and extract its contents into a permissible directory, e.g. under your home folder: C:\Users\sue\ParaView-5.4.1-Qt5-OpenGL2-Windows-64bit\

Then, in ParaView, using the Tools menu, open the Python Shell, and see if you can import scipy via:

>>> import sys
>>> sys.path.insert(0, r'c:\py27\Lib\site-packages')
>>> import scipy

Before proceeding with the tutorial using the PhysiCell state files, you will want to open ParaView’s Settings and check (toggle on) the “Auto Apply” property in the General settings tab.

OSX

OSX comes with a system Python pre-installed. And, at least with fairly recent versions of OSX, it comes with the scipy module also. You can test to see if that is the case on your system:

/usr/bin/python
>>> import scipy

To install ParaView, download a .pkg, not a .dmg, e.g. ParaView-5.4.1-Qt5-OpenGL2-MPI-OSX10.8-64bit.pkg. Then expand that .pkg contents into a permissible directory and untar its “Payload” which contains the actual paraview executable that you can open/run:

cd ~/Downloads
pkgutil --expand ParaView-5.4.1-Qt5-OpenGL2-MPI-OSX10.8-64bit.pkg ~/paraview
cd ~/paraview
tar -xvf Payload
open Contents/MacOS/paraview

Linux

Like OSX, Linux also comes with a system Python pre-installed. However, it does not come with the scipy module. To install the scipy Python module, run:

/usr/bin/python -m pip install scipy --user

To install and run ParaView, download the .tar.gz file, e.g. ParaView-5.4.1-Qt5-OpenGL2-MPI-Linux-64bit.tar.gz and uncompress it into a permissible directory. For example, from your home directory (after downloading):

mv ~/Downloads/ParaView-5.4.1-Qt5-OpenGL2-MPI-Linux-64bit.tar.gz .
tar -xvzf ParaView-5.4.1-Qt5-OpenGL2-MPI-Linux-64bit.tar.gz
cd ParaView-5.4.1-Qt5-OpenGL2-MPI-Linux-64bit/bin
./paraview

Appendix B: Editing the PhysiCellReader_cells Python Script

In case you encounter an error reading your .mat file(s) containing cell data, you will probably need to manually edit the directory path to the file. This is illustrated below in a 4-step process: 1) select the Pipeline Browser tab, 2) select the PhysiCellReader_cells object, 3) select its Properties, then 4) edit the relevant information in the (Python) script. (The script frame is scrollable).


Adding a directory to your Windows path

When you’re setting your BioFVM / PhysiCell g++ development environment, you’ll need to add the compiler, MSYS, and your text editor (like Notepad++) to your system path. For example, you may need to add folders like these to your system PATH variable:

  1. c:\Program Files\mingw-w64\x86_64-5.3.0-win32-seh-rt_v4_rev0\mingw64\bin\
  2. c:\Program Files (x86)\Notepad++\
  3. C:\MinGW\msys\1.0\bin\

Here’s how to do that in various versions of Windows.

Windows XP, 7, and 8

First, open up a text editor, and concatenate your three paths into a single block of text, separated by semicolons (;):

  1. Open notepad ([Windows]+R, notepad)
  2. Type a semicolon, paste in the first path, and append a semicolon. It should look like this:
    ;c:\Program Files\mingw-w64\x86_64-5.3.0-win32-seh-rt_v4_rev0\mingw64\bin\;
  3. Paste in the next path, and append a semicolon. It should look like this:
    ;c:\Program Files\mingw-w64\x86_64-5.3.0-win32-seh-rt_v4_rev0\mingw64\bin\;C:\Program Files (x86)\Notepad++\;
  4. Paste in the last path, and append a semicolon. It should look something like this:
    ;c:\Program Files\mingw-w64\x86_64-5.3.0-win32-seh-rt_v4_rev0\mingw64\bin\;C:\Program Files (x86)\Notepad++\;c:\MinGW\msys\1.0\bin\;
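
If you would rather not assemble the string by hand, the concatenation above can be sketched in Python. The function name build_path_suffix is ours; backslashes are doubled because a backslash is an escape character inside ordinary Python string literals.

```python
def build_path_suffix(folders):
    """Join folders with semicolons, with a leading and trailing semicolon."""
    return ";" + ";".join(folders) + ";"


folders = [
    "c:\\Program Files\\mingw-w64\\x86_64-5.3.0-win32-seh-rt_v4_rev0\\mingw64\\bin\\",
    "C:\\Program Files (x86)\\Notepad++\\",
    "c:\\MinGW\\msys\\1.0\\bin\\",
]

if __name__ == "__main__":
    # Prints the block of text to paste at the end of the existing Path value.
    print(build_path_suffix(folders))
```

The script only builds the text; you still paste it into the Path variable yourself.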

Lastly, add these paths to the system path:

  1. Go to the Start Menu, then right-click “This PC” or “My Computer”, and choose “Properties.”
  2. Click on “Advanced system settings”
  3. Click on “Environment Variables…” in the “Advanced” tab
  4. Scroll through the “System Variables” below until you find Path.
  5. Select “Path”, then click “Edit…”
  6. At the very end of “Variable Value”, paste what you made in Notepad in the prior steps. Make sure to paste at the end of the existing value, rather than overwriting it!
  7. Hit OK, OK, and OK to completely exit the “Advanced system settings.”

Windows 10:

Windows 10 has made it harder to find these settings, but easier to edit them. First, let’s find the system path:

  1. At the “run / search / Cortana” box next to the start menu, type “view advanced”, and you should see “view advanced system settings” auto-complete:
  2. Click to enter the advanced system settings, then choose environment variables … at the bottom of this box, and scroll down the list of user variables to Path
  3. Click on edit, then click New to add a new path. In the new entry (a new line), paste in your first new path (the compiler):
  4. Repeat this for the other two paths, then click OK, OK, Apply, OK to apply the new paths and exit.

2018 Macklin lab speaking schedule

Members of Paul Macklin’s lab are speaking at the following events:

 


Working with PhysiCell MultiCellDS digital snapshots in Matlab


PhysiCell 1.2.1 and later saves data as a specialized MultiCellDS digital snapshot, which includes chemical substrate fields, mesh information, and a readout of the cells and their phenotypes at a single simulation time point. This tutorial will help you learn to use the Matlab processing files included with PhysiCell.

This tutorial assumes you know (1) how to work at the shell / command line of your operating system, and (2) basic plotting and other functions in Matlab.

Key elements of a PhysiCell digital snapshot

A PhysiCell digital snapshot (a customized form of the MultiCellDS digital simulation snapshot) includes the following elements saved as XML and MAT files:

  1. output12345678.xml : This is the “base” output file, in MultiCellDS format. It includes key metadata such as when the file was created, the software, microenvironment information, and custom data saved at the simulation time. The Matlab files read this base file to find other related files (listed next). Example: output00003696.xml
  2. initial_mesh0.mat : This is the computational mesh information for BioFVM at time 0.0. Because BioFVM and PhysiCell do not use moving meshes, we do not save this data at any subsequent time.
  3. output12345678_microenvironment0.mat : This saves each biochemical substrate in the microenvironment at the computational voxels defined in the mesh (see above). Example: output00003696_microenvironment0.mat
  4. output12345678_cells.mat : This saves very basic cellular information related to BioFVM, including cell positions, volumes, secretion rates, uptake rates, and secretion saturation densities. Example: output00003696_cells.mat
  5. output12345678_cells_physicell.mat : This saves extra PhysiCell data for each cell agent, including volume information, cell cycle status, motility information, cell death information, basic mechanics, and any user-defined custom data. Example: output00003696_cells_physicell.mat
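
The naming pattern in the list above (an 8-digit, zero-padded index) is easy to generate programmatically. Here is a small sketch; the function name snapshot_files and the dictionary keys are our own choices for illustration.

```python
def snapshot_files(index):
    """Return the file names that make up one snapshot, keyed by role."""
    base = "output%08d" % index
    return {
        "xml": base + ".xml",
        "mesh": "initial_mesh0.mat",  # saved once, at time 0.0
        "microenvironment": base + "_microenvironment0.mat",
        "cells": base + "_cells.mat",
        "cells_physicell": base + "_cells_physicell.mat",
    }


if __name__ == "__main__":
    for role, name in sorted(snapshot_files(3696).items()):
        print("%-17s %s" % (role, name))
```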

These snapshots make extensive use of Matlab Level 4 .mat files, for fast, compact, and well-supported saving of array data. Note that even if you cannot read MultiCellDS XML files, you can work to parse the .mat files themselves.
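
To give a feel for how simple the Level 4 layout is, here is a sketch that writes and reads one matrix using only Python's standard library. It handles just one case, which we assume (but have not verified here) is the one PhysiCell writes: a real, little-endian, double-precision, full matrix (type code 0). The function names are ours. For real work, a tested reader such as scipy.io.loadmat is the better choice.

```python
import io
import struct


def write_mat4_matrix(stream, name, rows):
    """Write a real, double-precision, full matrix in Level 4 MAT layout."""
    mrows = len(rows)
    ncols = len(rows[0]) if rows else 0
    encoded_name = name.encode("ascii") + b"\x00"  # name is NUL-terminated
    # Header: type code (0 = little-endian, double, full), rows, columns,
    # imaginary flag, and name length including the terminating NUL.
    stream.write(struct.pack("<5i", 0, mrows, ncols, 0, len(encoded_name)))
    stream.write(encoded_name)
    # Values are stored column by column.
    for c in range(ncols):
        for r in range(mrows):
            stream.write(struct.pack("<d", rows[r][c]))


def read_mat4_matrix(stream):
    """Read one matrix written in the layout above; return (name, rows)."""
    header = stream.read(20)
    if len(header) != 20:
        raise ValueError("truncated header")
    type_code, mrows, ncols, imagf, namlen = struct.unpack("<5i", header)
    if type_code != 0 or imagf != 0:
        raise ValueError("this sketch only handles real little-endian doubles")
    name = stream.read(namlen).rstrip(b"\x00").decode("ascii")
    count = mrows * ncols
    flat = struct.unpack("<%dd" % count, stream.read(8 * count))
    # Undo the column-major storage to rebuild a list of rows.
    rows = [[flat[c * mrows + r] for c in range(ncols)] for r in range(mrows)]
    return name, rows


if __name__ == "__main__":
    buffer = io.BytesIO()
    write_mat4_matrix(buffer, "cells", [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
    buffer.seek(0)
    print(read_mat4_matrix(buffer))
```

The round trip through an in-memory buffer keeps the example self-contained; to try it on a real file, open the file in binary mode and pass the file object to read_mat4_matrix.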

The PhysiCell Matlab .m files

Every PhysiCell distribution includes some matlab functions to work with PhysiCell digital simulation snapshots, stored in the matlab subdirectory. The main ones are:

  1. composite_cutaway_plot.m : provides a quick, coarse 3-D cutaway plot of the discrete cells, with different colors for live (red), apoptotic (blue), and necrotic (black) cells.
  2. read_MultiCellDS_xml.m : reads the “base” PhysiCell snapshot and its associated matlab files.
  3. set_MCDS_constants.m : creates a data structure MCDS_constants that has the same constants as PhysiCell_constants.h. This is useful for identifying cell cycle phases, etc.
  4. simple_cutaway_plot.m : provides a quick, coarse 3-D cutaway plot of user-specified cells.
  5. simple_plot.m : provides a quick, coarse 3-D plot of the user-specified cells, without a cutaway or cross-sectional clipping plane.

A note on GNU Octave

Unfortunately, GNU Octave does not include XML file parsing without some significant user tinkering. And once you’re done, it is approximately one order of magnitude slower than Matlab. Octave users can directly import the .mat files described above, but without the helpful metadata in the XML file. We’ll provide more information on the structure of these MAT files in a future blog post. Moreover, we plan to provide Python and other tools for users without access to Matlab.

A sample digital snapshot

We provide a 3-D simulation snapshot from the final simulation time of the cancer-immune example in Ghaffarizadeh et al. (2017, in review) at:

https://sourceforge.net/projects/physicell/files/Tutorials/MultiCellDS/3D_PhysiCell_matlab_sample.zip/download

The corresponding SVG cross-section for that time (through z = 0 μm) looks like this:

Unzip the sample dataset in any directory, and make sure the matlab files above are in the same directory (or in your Matlab path). If you’re inside matlab:

!unzip 3D_PhysiCell_matlab_sample.zip

Loading a PhysiCell MultiCellDS digital snapshot

Now, load the snapshot:

MCDS = read_MultiCellDS_xml( 'output00003696.xml'); 

This will load the mesh, substrates, and discrete cells into the MCDS data structure, and give a basic summary:

Typing ‘MCDS’ and then hitting ‘tab’ (for auto-completion) shows the overall structure of MCDS, stored as metadata, mesh, continuum variables, and discrete cells:

To get simulation metadata, such as the current simulation time, look at MCDS.metadata.current_time

Here, we see that the current simulation time is 30240 minutes, or 21 days. MCDS.metadata.current_runtime gives the elapsed wall time up to this point: about 53 hours (1.9e5 seconds), including file I/O time to write full simulation data once per 3 simulated minutes after the start of the adaptive immune response.
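
The unit conversions quoted above are easy to verify. A tiny sketch (the helper names are ours):

```python
def minutes_to_days(minutes):
    """Convert simulated minutes to days."""
    return minutes / (60.0 * 24.0)


def seconds_to_hours(seconds):
    """Convert wall-clock seconds to hours."""
    return seconds / 3600.0


if __name__ == "__main__":
    print(minutes_to_days(30240))
    print(round(seconds_to_hours(1.9e5), 1))
```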

Plotting chemical substrates

Let’s make an oxygen contour plot through z = 0 μm. First, we find the index corresponding to this z-value:

k = find( MCDS.mesh.Z_coordinates == 0 ); 
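
An exact comparison like this only finds a match if a voxel center lies exactly at the requested z-value. If you are working with the mesh coordinates in another language, a nearest-value search is a more forgiving alternative. Below is a minimal Python sketch of that idea; the function name is ours and the coordinate lists are made-up illustrations. Remember that Python indices start at 0, whereas Matlab indices start at 1.

```python
def nearest_index(coordinates, target):
    """Return the 0-based index of the coordinate closest to target."""
    return min(range(len(coordinates)),
               key=lambda i: abs(coordinates[i] - target))


if __name__ == "__main__":
    z_centers = [-30.0, -10.0, 10.0, 30.0]  # no center at exactly 0
    print(nearest_index(z_centers, 12.0))
```

When two coordinates are equally close, the first one wins.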

Next, let’s figure out which variable is oxygen. Type “MCDS.continuum_variables.name”, which will show the array of variable names:

Here, oxygen is the first variable, (index 1). So, to make a filled contour plot:

contourf( MCDS.mesh.X(:,:,k), MCDS.mesh.Y(:,:,k), ...
     MCDS.continuum_variables(1).data(:,:,k) , 20 ) ;

Now, let’s set this to a correct aspect ratio (no stretching in x or y), add a colorbar, and set the axis labels, using metadata to get labels:

axis image
colorbar 
xlabel( sprintf( 'x (%s)' , MCDS.metadata.spatial_units) ); 
ylabel( sprintf( 'y (%s)' , MCDS.metadata.spatial_units) ); 

Lastly, let’s add an appropriate (time-based) title:

title( sprintf('%s (%s) at t = %3.2f %s, z = %3.2f %s', MCDS.continuum_variables(1).name , ...
     MCDS.continuum_variables(1).units , ...
     MCDS.metadata.current_time , ...
     MCDS.metadata.time_units, ... 
     MCDS.mesh.Z_coordinates(k), ...
     MCDS.metadata.spatial_units ) ); 

Here’s the end result:

We can easily export graphics, such as to PNG format:

print( '-dpng' , 'output_o2.png' );

For more on plotting BioFVM data, see the tutorial at http://www.mathcancer.org/blog/saving-multicellds-data-from-biofvm/

Plotting cells in space

3-D point cloud

First, let’s plot all the cells in 3D:

plot3( MCDS.discrete_cells.state.position(:,1) , MCDS.discrete_cells.state.position(:,2), ...
	MCDS.discrete_cells.state.position(:,3) , 'bo' ); 

At first glance, this does not look good: some cells are far out of the simulation domain, distorting the automatic range of the plot:

This does not ordinarily happen in PhysiCell (the default cell mechanics functions have checks to prevent such behavior), but this example includes a simple Hookean elastic adhesion model for immune cell attachment to tumor cells. In rare circumstances, an attached tumor cell or immune cell can apoptose on its own (due to its background apoptosis rate), without “knowing” to detach itself from the surviving cell in the pair. The remaining cell attempts to calculate its elastic velocity based upon an invalid cell position (no longer in memory), creating an artificially large velocity that “flings” it out of the simulation domain. Such cells are not simulated any further, so this is effectively equivalent to an extra apoptosis event (only 3 cells are out of the simulation domain after tens of millions of cell-cell elastic adhesion calculations). Future versions of this example will include extra checks to prevent this rare behavior.

The plot can simply be fixed by changing the axis:

axis( 1000*[-1 1 -1 1 -1 1] )
axis square 

Notice that this is a very difficult plot to read, and very non-interactive (laggy) to rotation and scaling operations. We can make a slightly nicer plot by searching for different cell types and plotting them with different colors:

% make it easier to work with the cell positions; 
P = MCDS.discrete_cells.state.position;

% find type 1 cells
ind1 = find( MCDS.discrete_cells.metadata.type == 1 ); 
% better still, eliminate those out of the simulation domain 
ind1 = find( MCDS.discrete_cells.metadata.type == 1 & ...
    abs(P(:,1))' < 1000 & abs(P(:,2))' < 1000 & abs(P(:,3))' < 1000 );

% find type 0 cells
ind0 = find( MCDS.discrete_cells.metadata.type == 0 & ...
    abs(P(:,1))' < 1000 & abs(P(:,2))' < 1000 & abs(P(:,3))' < 1000 ); 

%now plot them
P = MCDS.discrete_cells.state.position;
plot3( P(ind0,1), P(ind0,2), P(ind0,3), 'bo' )
hold on
plot3( P(ind1,1), P(ind1,2), P(ind1,3), 'ro' )
hold off
axis( 1000*[-1 1 -1 1 -1 1] )
axis square
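
For readers who plan to process the cell data outside of Matlab, the same selection (cells of a given type that lie inside the ±1000 cube) can be sketched in plain Python. The function name and the sample data below are made up for illustration; indices are 0-based here.

```python
def select_cells(cell_types, positions, wanted_type, half_width=1000.0):
    """Return 0-based indices of cells of wanted_type inside the cube."""
    selected = []
    for i, (cell_type, (x, y, z)) in enumerate(zip(cell_types, positions)):
        inside = (abs(x) < half_width and abs(y) < half_width
                  and abs(z) < half_width)
        if cell_type == wanted_type and inside:
            selected.append(i)
    return selected


if __name__ == "__main__":
    types = [0, 1, 1, 0]
    points = [(0, 0, 0), (10, 20, 30), (5000, 0, 0), (-999, 999, 0)]
    print(select_cells(types, points, 1))
    print(select_cells(types, points, 0))
```

In this sample, the third cell is of type 1 but lies outside the domain, so it is dropped.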

However, this isn’t much better. You can use the scatter3 function to gain more control on the size and color of the plotted cells, or even make macros to plot spheres in the cell locations (with shading and lighting), but Matlab is very slow when plotting beyond 10^3 cells. Instead, we recommend the faster preview functions below for data exploration, and higher-quality plotting (e.g., by POV-Ray) for final publication.

Fast 3-D cell data previewers

Notice that plot3 and scatter3 are painfully slow for any nontrivial number of cells. We can use a few fast previewers to quickly get a sense of the data. First, let’s plot all the dead cells, and make them red:

clf
simple_plot( MCDS, MCDS.discrete_cells.dead_cells , 'r' )


This function creates a coarse-grained 3-D indicator function (0 if no cells are present; 1 if they are), and plots a 3-D level surface. It is very responsive to rotations and other operations to explore the data. You may notice the second argument is a list of indices: only these cells are plotted. This gives you a method to select cells with specific characteristics when plotting. (More on that below.) If you want to get a sense of the interior structure, use a cutaway plot:

clf
simple_cutaway_plot( MCDS, MCDS.discrete_cells.dead_cells , 'r' )
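
The idea of a coarse-grained indicator function (0 where no cells are present, 1 where they are) can be illustrated with a short Python sketch. This is not the implementation in simple_plot.m; it only shows one way to bin cell positions into coarse voxels, and the function name and the 20-unit voxel size are assumptions for the example.

```python
import math


def occupied_voxels(positions, voxel_size):
    """Return the set of (i, j, k) coarse voxels that contain at least one cell."""
    return {
        (int(math.floor(x / voxel_size)),
         int(math.floor(y / voxel_size)),
         int(math.floor(z / voxel_size)))
        for (x, y, z) in positions
    }


if __name__ == "__main__":
    cells = [(5.0, 5.0, 5.0), (7.0, 8.0, 9.0), (25.0, 0.0, -1.0)]
    print(sorted(occupied_voxels(cells, 20.0)))
```

Voxels in the returned set have indicator value 1 and all others 0. Because many cells collapse into one voxel, the resulting surface is far cheaper to draw than one marker per cell.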

We also provide a fast “composite” cutaway which plots all live cells as red, apoptotic cells as blue (without the cutaway), and all necrotic cells as black:

clf
composite_cutaway_plot( MCDS )


Lastly, we show an improved plot that uses different colors for the immune cells, and Matlab’s “find” function to help set up the indexing:

constants = set_MCDS_constants

% find the type 0 necrotic cells
ind0_necrotic = find( MCDS.discrete_cells.metadata.type == 0 & ...
    (MCDS.discrete_cells.phenotype.cycle.current_phase == constants.necrotic_swelling | ...
    MCDS.discrete_cells.phenotype.cycle.current_phase == constants.necrotic_lysed | ...
    MCDS.discrete_cells.phenotype.cycle.current_phase == constants.necrotic) ); 

% find the live type 0 cells
ind0_live = find( MCDS.discrete_cells.metadata.type == 0 & ...
    (MCDS.discrete_cells.phenotype.cycle.current_phase ~= constants.necrotic_swelling & ...
    MCDS.discrete_cells.phenotype.cycle.current_phase ~= constants.necrotic_lysed & ...
    MCDS.discrete_cells.phenotype.cycle.current_phase ~= constants.necrotic & ...
    MCDS.discrete_cells.phenotype.cycle.current_phase ~= constants.apoptotic) ); 

clf
% plot live tumor cells red, in cutaway view
simple_cutaway_plot( MCDS, ind0_live , 'r' ); 
hold on 
% plot dead tumor cells black, in cutaway view 
simple_cutaway_plot( MCDS, ind0_necrotic , 'k' ) 
% plot all immune cells, but without cutaway (to show how they infiltrate)
simple_plot( MCDS, ind1, 'g' ) 
hold off

A small cautionary note on future compatibility

PhysiCell 1.2.1 uses the <custom> data tag (allowed as part of the MultiCellDS specification) to encode its cell data. This allows a more compact data representation: the current MultiCellDS draft does not support such a formulation, and Matlab is painfully slow at parsing XML files larger than ~50 MB. Thus, PhysiCell snapshots are not yet fully compatible with general MultiCellDS tools, which would by default ignore custom data. In the future, we will make available converter utilities to transform “native” custom PhysiCell snapshots to MultiCellDS snapshots that encode all the cellular information in a more verbose but compatible XML format.

Closing words and future work

Because Octave is not a great option for parsing XML files (with critical MultiCellDS metadata), we plan to write similar functions to read and plot PhysiCell snapshots in Python, as an open source alternative. Moreover, our lab in the next year will focus on creating further MultiCellDS configuration, analysis, and visualization routines. We also plan to provide additional 3-D functions for plotting the discrete cells and varying color with their properties.

In the longer term, we will develop open source, stand-alone analysis and visualization tools for MultiCellDS snapshots (including PhysiCell snapshots). Please stay tuned!


Frequently Asked Questions (FAQs) for Building PhysiCell

Here, we document common problems and solutions in compiling and running PhysiCell projects.

Compiling Errors

I get the error “clang: error: unsupported option ‘-fopenmp’” when I compile a PhysiCell Project

When compiling a PhysiCell project in OSX, you may see an error like this:

This shows that clang is being used as the compiler, instead of g++. If you are using PhysiCell 1.2.2 or later, fix this error by setting the PHYSICELL_CPP environment variable. If you installed by Homebrew:

echo export PHYSICELL_CPP=g++-7 >> ~/.bash_profile

If you installed by MacPorts:

echo export PHYSICELL_CPP=g++-mp-7 >> ~/.bash_profile

To fix this error in earlier versions of PhysiCell (1.2.1 or earlier), edit your Makefile, and fix the CC line.

CC := g++-7 # use this if you installed g++ 7 by Homebrew
or
CC := g++-mp-7 # use this if you installed g++ 7 by MacPorts
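
If you are unsure whether the compiler you selected really supports OpenMP, a tiny test program can help before you rebuild a whole project. The sketch below is not part of PhysiCell, and the function names openmp_thread_count and report_openmp are hypothetical. It relies only on the standard _OPENMP macro, which OpenMP-capable compilers define when they are invoked with -fopenmp:

```cpp
#include <cassert>
#include <iostream>

#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper (not part of PhysiCell): returns the maximum number of
// OpenMP threads, or 0 if this file was compiled without OpenMP support.
int openmp_thread_count( void )
{
     #ifdef _OPENMP
     return omp_get_max_threads();
     #else
     return 0;
     #endif
}

// Prints a one-line summary of what the compiler gave us.
void report_openmp( void )
{
     int threads = openmp_thread_count();
     assert( threads >= 0 );
     if( threads > 0 )
     { std::cout << "OpenMP is enabled (up to " << threads << " threads)" << std::endl; }
     else
     { std::cout << "OpenMP is not enabled" << std::endl; }
     return;
}
```

Call report_openmp() from a small main function and compile it with the same compiler and the -fopenmp flag. If the compile step itself fails with the “unsupported option” error, you are most likely still picking up clang.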

If you have not installed g++ by MacPorts or Homebrew, please see the following tutorials:

When I compile, I get tons of weird “no such instruction” errors like
“no such instruction: `vmovsd (%rdx,%rax), %xmm0'” or
“no such instruction: `vfnmadd132sd (%rsi,%rax,8), %xmm5,%xmm0'”

When you compile, you may see a huge list of arcane symbols, like this:

The “no such instruction” means that the compiler is trying to send CPU instructions (like these vmovsd lines) that your system doesn’t understand. (It’s like trying SSE4 instructions on a Pentium 4.) The first solution to try is to use a safer architecture, using the ARCH definition. Open your Makefile, and search for the ARCH line. If it isn’t set to native, then try that. If it is already set to native, try a safer, older architecture like core2 by commenting out native (add a # symbol to any Makefile line to comment it out), and uncommenting the core2 line:

# ARCH := native
ARCH := core2

“I don’t understand the ‘m’ flag!”

When you compile a project, you may see an error that looks like this:

This seems to be due to incompatibilities between the MacPorts gcc toolchain and Homebrew (especially if you installed gcc5 in MacPorts). As we showed in this tutorial, you can open a terminal window and run a single command to fix it:

echo export PATH=/usr/bin:$PATH >> ~/.profile

Note that you’ll need to restart your terminal to fully apply the fix.

Errors Running PhysiCell

My project compiles fine, but when I run it, I get errors like “illegal instruction: 4“.

This means that PhysiCell has been compiled for the wrong architecture, and it is sending unsupported instructions to your CPU. See the “no such instruction” error above for a fix.

I fixed my Makefile, and things compiled fine, but I can’t compile a different project or the sample projects.

For PhysiCell 1.2.2 or later:
The Makefile rules for the sample projects (e.g., make biorobots-sample) overwrite the Makefile in the PhysiCell root directory, so you’ll need to return to the original state to re-populate with a new sample project. Use

make reset

and then you’ll be good to go. As promised (below), we updated PhysiCell so that OSX users don’t need to fix the CC line for every single Makefile.

For PhysiCell 1.2.1 and earlier:
The Makefile rules for the sample projects (e.g., make biorobots-sample) overwrite the Makefile in the PhysiCell root directory, so you’ll need to re-modify the Makefile with the correct CC (and potentially ARCH) lines any time you run a template project or sample project make rule. This will be improved in future editions of PhysiCell. Sorry!!

It compiled fine, but the project crashes with “Segmentation fault: 11″, or the program just crashes with “killed.”

Everything compiles just fine and your program starts, but you may get a segmentation fault either early on, or later in your simulation. Like this:

[Screenshot soon! This error is rare.]

Or on Linux systems it might just crash with a simple “killed” message:

This error occurs if there is not enough (contiguous) memory to run a project. If you are running in a Virtual Machine, you can solve this by increasing the amount of memory. If you are running “natively” you may need to install more RAM or decrease the problem size (the size of the simulation domain or the number of cells). To date, we have only encountered this error on virtual machines with little memory. We recommend using 8192 MB (8 GB):


Running the PhysiCell sample projects

Introduction

In PhysiCell 1.2.1 and later, we include four sample projects on cancer heterogeneity, bioengineered multicellular systems, and cancer immunology. This post will walk you through the steps to build and run the examples.

If you are new to PhysiCell, you should first make sure you’re ready to run it. (Please note that this applies in particular to OSX users, as Xcode’s g++ is not compatible out-of-the-box.) Here are tutorials on getting ready to run PhysiCell:

  1. Setting up a 64-bit gcc environment in Windows.
  2. Setting up gcc / OpenMP on OSX (MacPorts edition)
  3. Setting up gcc / OpenMP on OSX (Homebrew edition)
    Note: This is the preferred method for Mac OSX.
  4. Getting started with a PhysiCell Virtual Appliance (for virtual machines like VirtualBox)
    Note: The “native” setups above are preferred, but the Virtual Appliance is a great “plan B” if you run into trouble

Please note that we expect to expand this tutorial.

Building, running, and viewing the sample projects

All of these projects will create data of the following forms:

  1. Scalable vector graphics (SVG) cross-section plots through z = 0.0 μm at each output time. Filenames will look like snapshot00000000.svg.
  2. Matlab (Level 4) .mat files to store raw BioFVM data. Filenames will look like output00000000_microenvironment0.mat (for the chemical substrates) and output00000000_cells.mat (for basic agent data).
  3. Matlab .mat files to store additional PhysiCell agent data. Filenames will look like output00000000_cells_physicell.mat.
  4. MultiCellDS .xml files that give further metadata and structure for the .mat files. Filenames will look like output00000000.xml.
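
If you post-process many snapshots, it helps to generate these filenames programmatically. The sketch below assumes (based only on the example names above) that the output index is a non-negative integer zero-padded to eight digits; the helper name snapshot_filename is hypothetical and is not a PhysiCell function:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: builds names such as "output00000000.xml" from a
// prefix, an output index, and a suffix. Assumes eight-digit zero padding,
// as suggested by the example filenames.
std::string snapshot_filename( const std::string& prefix , int index , const std::string& suffix )
{
     assert( index >= 0 );

     char digits[32];
     std::snprintf( digits , sizeof( digits ) , "%08d" , index );

     return prefix + digits + suffix;
}
```

For example, snapshot_filename( "output" , 0 , ".xml" ) gives output00000000.xml, and snapshot_filename( "snapshot" , 12 , ".svg" ) gives snapshot00000012.svg.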

You can read the combined data in the XML and MAT files with the read_MultiCellDS_xml function, stored in the matlab directory of every PhysiCell download. (Copy the read_MultiCellDS_xml.m and set_MCDS_constants.m files to the same directory as your data for the greatest simplicity.)

(If you are using Mac OSX and PhysiCell version 1.2.2 or later, remember to set the PHYSICELL_CPP environment variable to an OpenMP-capable compiler; cf. the Homebrew setup.)

Biorobots (2D)

Type the following from a terminal window in your root PhysiCell directory:

make biorobots-sample
make 
./biorobots
make reset # optional -- gets a clean slate to try other samples

Because this is a 2-D example, the SVG snapshot files will provide the simplest method of visualizing these outputs. You can use utilities like ImageMagick to convert them into other formats for publications, such as PNG or EPS.

Anti-cancer biorobots (2D)

make cancer-biorobots-sample
make 
./cancer_biorobots
make reset # optional -- gets a clean slate to try other samples 

Cancer heterogeneity (2D)

make heterogeneity-sample
make
./heterogeneity
make reset # optional -- gets a clean slate to try other samples 

Cancer immunology (3D)

make cancer-immune-sample
make 
./cancer_immune_3D
make reset # optional -- gets a clean slate to try other samples 

 


A small computational thought experiment

In Macklin (2017), I briefly touched on a simple computational thought experiment that shows that for a group of homogeneous cells, you can observe substantial heterogeneity in cell behavior. This “thought experiment” is part of a broader preview and discussion of a fantastic paper by Linus Schumacher, Ruth Baker, and Philip Maini published in Cell Systems, where they showed that a migrating collective of homogeneous cells can show heterogeneous behavior when quantitated with new migration metrics. I highly encourage you to check out their work!

In this blog post, we work through my simple thought experiment in a little more detail.

Note: If you want to reference this blog post, please cite the Cell Systems preview article:

P. Macklin, When seeing isn’t believing: How math can guide our interpretation of measurements and experiments. Cell Sys., 2017 (in press). DOI: 10.1016/j.cels.2017.08.005

The thought experiment

Consider a simple (and widespread) model of a population of cycling cells: each virtual cell (with index i) has a single “oncogene” \( r_i \) that sets the rate of progression through the cycle. Between now (t) and a small time from now ( \(t+\Delta t\)), the virtual cell has a probability \(r_i \Delta t\) of dividing into two daughter cells. At the population scale, the overall population growth model that emerges from this simple single-cell model is:
\[\frac{dN}{dt} = \langle r\rangle N, \]
where \( \langle r \rangle \) is the mean division rate over the cell population, and \( N \) is the number of cells. See the discussion in the supplementary information for Macklin et al. (2012).

Now, suppose (as our thought experiment) that we could track individual cells in the population and track how long it takes them to divide. (We’ll call this the division time.) What would the distribution of cell division times look like, and how would it vary with the distribution of the single-cell rates \(r_i\)?

Mathematical method

In the Matlab script below, we implement this cell cycle model as just about every discrete model does. Here’s the pseudocode:

t = 0; 
while( t < t_max )
    for i=1:Cells.size()
        u = random_number();
        if( u < Cells[i].birth_rate * dt )
            Cells[i].division_time = Cells[i].age; 
            Cells[i].divide();
        end
    end
    t = t+dt; 
end

That is, until we’ve reached the final simulation time, loop through all the cells and decide if they should divide: For each cell, choose a random number between 0 and 1, and if it’s smaller than the cell’s division probability (\(r_i \Delta t\)), then divide the cell and write down the division time.
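
The per-step rule above also shows where an exponential distribution of division times can come from. As a short sketch, assume a single cell with a fixed rate \(r\) that is tested once per time step \(\Delta t\). (The symbol \(T\) for the division time is introduced here for illustration.) The cell survives each step undivided with probability \(1 - r\Delta t\), so after \(t/\Delta t\) steps:
\[ P(T > t) = (1 - r \Delta t)^{t/\Delta t} \longrightarrow e^{-r t} \quad \textrm{as } \Delta t \to 0. \]
This is the survival function of an exponential distribution with density \(r e^{-rt}\), whose mean is
\[ \langle T \rangle = \int_0^\infty t \, r e^{-r t} \, dt = \frac{1}{r}. \]
For instance, a rate of \(0.05 \textrm{ hr}^{-1}\) would correspond to a mean division time of 20 hours.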

As an important note, we have to track the same cells until they all divide, rather than merely record which cells have divided up to the end of the simulation. Otherwise, we end up with an observational bias that throws off our recording. See more below.
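
To make the tracking rule concrete, here is a minimal C++ sketch of the same idea. It is not the downloadable Matlab script: the function names, the fixed random seed, and the choice to record the time at the end of the step in which a cell divides are all assumptions made for illustration. It follows a fixed cohort of identical cells until every one of them has divided, which avoids the observational bias described above:

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical sketch: follow a fixed cohort of cells until every one has
// divided, and record each cell's division time (in the units of dt).
std::vector<double> simulate_division_times( std::size_t number_of_cells , double birth_rate , double dt , unsigned int seed )
{
     assert( birth_rate > 0.0 && dt > 0.0 && birth_rate * dt < 1.0 );

     std::mt19937 generator( seed );
     std::uniform_real_distribution<double> uniform( 0.0 , 1.0 );

     // a negative entry means "this cell has not divided yet"
     std::vector<double> division_times( number_of_cells , -1.0 );
     std::size_t remaining = number_of_cells;
     double t = 0.0;

     while( remaining > 0 )
     {
          t += dt;
          for( std::size_t i = 0 ; i < number_of_cells ; i++ )
          {
               // only cells from the original cohort that are still undivided are tested
               if( division_times[i] < 0.0 && uniform( generator ) < birth_rate * dt )
               {
                    division_times[i] = t;
                    remaining--;
               }
          }
     }
     return division_times;
}

// Arithmetic mean of the recorded times.
double mean_value( const std::vector<double>& values )
{
     assert( !values.empty() );
     double sum = 0.0;
     for( std::size_t i = 0 ; i < values.size() ; i++ )
     { sum += values[i]; }
     return sum / values.size();
}
```

With 10,000 cells, a rate of 0.05 per hour, and a time step of 0.1 hours, the mean of the returned times should land close to 20 hours, up to sampling noise. Daughter cells are deliberately not simulated, because only the original cohort is being observed.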

The sample code

You can download the Matlab code for this example at:

http://MathCancer.org/files/matlab/thought_experiment_matlab(Macklin_Cell_Systems_2017).zip

Extract all the files, and run “thought_experiment” in Matlab (or Octave, if you don’t have a Matlab license or prefer an open source platform) for the main result.

All these Matlab files are available as open source, under the GPL license (version 3 or later).

Results and discussion

First, let’s see what happens if all the cells are identical, with \(r = 0.05 \textrm{ hr}^{-1}\). We run the script, and track the time for each of 10,000 cells to divide. As expected by theory (Macklin et al., 2012) (but perhaps still a surprise if you haven’t looked), we get an exponential distribution of division times, with mean time \(1/\langle r \rangle\):

So even in this simple model, a homogeneous population of cells can show heterogeneity in their behavior. Here’s the interesting thing: let’s now give each cell its own division parameter \(r_i\) from a normal distribution with mean \(0.05 \textrm{ hr}^{-1}\) and a relative standard deviation of 25%:

If we repeat the experiment, we get the same distribution of cell division times!

So in this case, based solely on observations of the phenotypic heterogeneity (the division times), it is impossible to distinguish a “genetically” homogeneous cell population (one with identical parameters) from a truly heterogeneous population. We would require other metrics, like tracking changes in the mean division time as cells with a higher \(r_i\) out-compete the cells with lower \(r_i\).

Lastly, I want to point out that caution is required when designing these metrics and single-cell tracking. If instead we had tracked all cells throughout the simulated experiment, including new daughter cells, and then recorded the first 10,000 cell division events, we would get a very different distribution of cell division times:

By only recording the division times for the cells that have divided, and not those that haven’t, we bias our observations towards cells with shorter division times. Indeed, the mean division time for this simulated experiment is far lower than we would expect by theory. You can try this one by running “bad_thought_experiment”.

Further reading

This post is an expansion of our recent preview in Cell Systems in Macklin (2017):

P. Macklin, When seeing isn’t believing: How math can guide our interpretation of measurements and experiments. Cell Sys., 2017 (in press). DOI: 10.1016/j.cels.2017.08.005

And the original work on apparent heterogeneity in collective cell migration is by Schumacher et al. (2017):

L. Schumacher et al., Semblance of Heterogeneity in Collective Cell Migration. Cell Sys., 2017 (in press). DOI: 10.1016/j.cels.2017.06.006

You can read some more on relating exponential distributions and Poisson processes to common discrete mathematical models of cell populations in Macklin et al. (2012):

P. Macklin, et al., Patient-calibrated agent-based modelling of ductal carcinoma in situ (DCIS): From microscopic measurements to macroscopic predictions of clinical progression. J. Theor. Biol. 301:122-40, 2012. DOI: 10.1016/j.jtbi.2012.02.002.

Lastly, I’d be delighted if you took a look at the open source software we have been developing for 3-D simulations of multicellular systems biology:

http://OpenSource.MathCancer.org

And you can always keep up-to-date by following us on Twitter: @MathCancer.


Getting started with a PhysiCell Virtual Appliance

Note: This is part of a series of “how-to” blog posts to help new users and developers of BioFVM and PhysiCell. This guide is for users in OSX, Linux, or Windows using the VirtualBox virtualization software to run a PhysiCell virtual appliance.

These instructions should get you up and running without needing to install a compiler, makefile capabilities, or any other software (beyond the virtual machine and the PhysiCell virtual appliance). We note that using the PhysiCell source with your own compiler is still the preferred / ideal way to get started, but the virtual appliance option is a fast way to start even if you’re having trouble setting up your development environment.

What’s a Virtual Machine? What’s a Virtual Appliance?

A virtual machine is a full simulated computer (with its own disk space, operating system, etc.) running on another. They are designed to let a user test on a completely different environment, without affecting the host (main) environment. They also allow a very robust way of taking and reproducing the state of a full working environment.

A virtual appliance is just this: a full image of an installed system (and often its saved state) on a virtual machine, which can easily be installed on a new virtual machine. In this tutorial, you will download our PhysiCell virtual appliance and use its pre-configured compiler and other tools.

What you’ll need:

  • VirtualBox: This is a free, cross-platform program to run virtual machines on OSX, Linux, Windows, and other platforms. It is a safe and easy way to install one full operating system (a client system) on your main operating system (the host system). For us, this means that we can distribute a fully working Linux environment with a working copy of all the tools you need to compile and run PhysiCell. As of August 1, 2017, this will download Version 5.1.26.
  • PhysiCell Virtual Appliance: This is a single-file distribution of a virtual machine running Alpine Linux, including all key tools needed to compile and run PhysiCell. As of July 31, 2017, this will download PhysiCell 1.2.2 with g++ 6.3.0.
  • A computer with hardware support for virtualization: Your CPU needs to have hardware support for virtualization (almost all of them do now), and it has to be enabled in your BIOS. Consult your computer maker on how to turn this on if you get error messages later.

Main steps:

1) Install VirtualBox.

Double-click / open the VirtualBox download. Go ahead and accept all the default choices. If asked, go ahead and download/install the extensions pack.

2) Import the PhysiCell Virtual Appliance

Go to the “File” menu and choose “Import Virtual Appliance”. Browse to find the .ova file you just downloaded.

Click on “Next,” and import with all the default options. That’s it!

3) [Optional] Change settings

You most likely won’t need this step, but you can increase/decrease the amount of RAM used for the virtual machine if you select the PhysiCell VM, click the Settings button (orange gear), and choose “System”:

We set the Virtual Machine to have 4 GB of RAM. If you have a machine with lots of RAM (16 GB or more), you may want to set this to 8 GB.

Also, you can choose how many virtual CPUs to give to your VM: 

We selected 4 when we set up the Virtual Appliance, but you should match the number of physical processor cores on your machine. In my case, I have a quad core processor with hyperthreading. This means 4 real cores, 8 virtual cores, so I select 4 here.

4) Start the Virtual Machine and log in

Select the PhysiCell machine, and click the green “start” button. After the virtual machine boots (with the good old LILO boot manager that I’ve missed), you should see this:

Click the "More ..." button, and log in with username: physicell, password: physicell

5) Test the compiler and run your first simulation

Notice that PhysiCell is already there on the desktop in the PhysiCell folder. Right-click, and choose “open terminal here.” You’ll already be in the main PhysiCell root directory. 

Now, let’s compile your first project! Type “make template2D && make”.

And run your project! Type “./project” and let it go!

Go ahead and run just the first few days of the simulation (until about 7200 minutes), then hit <control>-C to cancel out. Or run the whole simulation; that’s fine, too.

6) Look at the results

We bundled a few tools to easily look at results. First, ristretto is a very fast image viewer. Let’s view the SVG files: As a nice tip, you can press the left and right arrows to advance through the SVG images, or hold the right arrow down to advance through quickly.

Now, let’s use ImageMagick to convert the SVG files into JPG files: call “magick mogrify -format jpg snap*.svg”

Next, let’s turn those images into a movie. I generally create movies that are 24 frames per second, so that 1 second of the movie is 1 hour of simulation time. We’ll use mencoder, with the options below chosen for a good quality vs. size tradeoff:

When you’re done, view the movie with mplayer. The options below scale the window to fit within the virtual monitor:

If you want to loop the movie, add “-loop 999” to your command.

7) Get familiar with other tools

Use nano (usage: nano <filename>) to quickly change files at the command line. Hit <control>-O to save your results. Hit <control>-X to exit. <control>-W will search within the file.

Use nedit (usage: nedit <filename> &) to open one or more text files in a graphical editor. This is a good way to edit multiple files at once.

Sometimes, you need to run commands at elevated (admin or root) privileges. Use sudo. Here’s an example, searching the Alpine Linux package manager apk for clang:

physicell:~$ sudo apk search clang
[sudo] password for physicell:  
clang-analyzer-4.0.0-r0
clang-libs-4.0.0-r0
clang-dev-4.0.0-r0
clang-static-4.0.0-r0
emscripten-fastcomp-1.37.10-r0
clang-doc-4.0.0-r0
clang-4.0.0-r0
physicell:~/Desktop/PhysiCell$ 

If you want to install clang/llvm (as an alternative compiler):

physicell:~$ sudo apk add clang
[sudo] password for physicell:  
physicell:~$ sudo apk search clang
clang-analyzer-4.0.0-r0
clang-libs-4.0.0-r0
clang-dev-4.0.0-r0
clang-static-4.0.0-r0
emscripten-fastcomp-1.37.10-r0
clang-doc-4.0.0-r0
clang-4.0.0-r0
physicell:~/Desktop/PhysiCell$ 

Notice that it asks for a password: use the password for root (which is physicell).

8) [Optional] Configure a shared folder

Coming soon.

Why bother with zipped source, then?

Given that we can get a whole development environment by just downloading and importing a virtual appliance, why
bother with all the setup of a native development environment, like this tutorial (Windows) or this tutorial (Mac)?

One word: performance. In my testing, I still have not found the performance running inside a
virtual machine to match compiling and running directly on your system. So, the Virtual Appliance is a great
option to get up and running quickly while trying things out, but I still recommend setting up natively with
one of the tutorials I linked in the preceding paragraphs.

What’s next?

In the coming weeks, we’ll post further tutorials on using PhysiCell. In the meantime, have a look at the
PhysiCell project website, and these links as well:

  1. BioFVM on MathCancer.org: http://BioFVM.MathCancer.org
  2. BioFVM on SourceForge: http://BioFVM.sf.net
  3. BioFVM Method Paper in Bioinformatics: http://dx.doi.org/10.1093/bioinformatics/btv730
  4. PhysiCell on MathCancer.org: http://PhysiCell.MathCancer.org
  5. PhysiCell on Sourceforge: http://PhysiCell.sf.net
  6. PhysiCell Method Paper (preprint): https://doi.org/10.1101/088773
  7. PhysiCell tutorials: [click here]


MathCancer C++ Style and Practices Guide

As PhysiCell, BioFVM, and other open source projects start to gain new users and contributors, it’s time to lay out a coding style. We have three goals here:

  1. Consistency: It’s easier to understand and contribute to the code if it’s written in a consistent way.
  2. Readability: We want the code to be as readable as possible.
  3. Reducing errors: We want to avoid coding styles that are more prone to errors. (e.g., code that can be broken by introducing whitespace).

So, here is the guide (revised June 2017). I expect to revise this guide from time to time.

Place braces on separate lines in functions and classes.

I find it much easier to read a class if the braces are on separate lines, with good use of whitespace. Remember: whitespace costs almost nothing, but reading and understanding (and time!) are expensive.

DON’T

class Cell{
public:
 double some_variable; 
 bool some_extra_variable;

 Cell(); };

class Phenotype{
public:
 double some_variable; 
 bool some_extra_variable;

 Phenotype();
};

DO

class Cell
{
 public:
     double some_variable; 
     bool some_extra_variable;

     Cell(); 
};

class Phenotype
{
 public:
     double some_variable; 
     bool some_extra_variable;

     Phenotype();
};

Enclose all logic in braces, even when optional.

In C/C++, you can omit the curly braces in some cases. For example, this is legal:

if( distance > 1.5*cell_radius )
     interaction = false; 
force = 0.0; // is this part of the logic, or a separate statement?
error = false; 

However, this code is ambiguous to interpret. Moreover, small changes to whitespace–or small additions to the logic–could mess things up here. Use braces to make the logic crystal clear:

DON’T

if( distance > 1.5*cell_radius )  
     interaction = false; 
force = 0.0; // is this part of the logic, or a separate statement?
error = false; 

if( condition1 == true )
  do_something1 = true; 
else if( condition2 == true )
  do_something2 = true;
else
  do_something3 = true; 

DO

if( distance > 1.5*cell_radius )  
{
     interaction = false; 
     force = 0.0;
}
error = false; 

if( condition1 == true )
{ do_something1 = true; }
else if( condition2 == true )
{ do_something2 = true; }
else
{ do_something3 = true; }

Put braces on separate lines in logic, except for single-line logic.

This style rule relates to the previous point, to improve readability.

DON’T

if( distance > 1.5*cell_radius ){ 
     interaction = false;
     force = 0.0; } 

if( condition1 == true ){ do_something1 = true; }
else if( condition2 == true ){ 
  do_something2 = true; }
else
{ do_something3 = true; error = true; }

DO

if( distance > 1.5*cell_radius )
{ 
     interaction = false;
     force = 0.0;
} 

if( condition1 == true )
{ do_something1 = true; } // this is fine
else if( condition2 == true )
{ 
     do_something2 = true; // this is better
}
else
{
     do_something3 = true;
     error = true;
}

See how much easier that code is to read? The logical structure is crystal clear, and adding more to the logic is simple.

End all functions with a return, even if void.

For clarity, definitively state that a function is done by using return.

DON’T

void my_function( Cell& cell )
{
     cell.phenotype.volume.total *= 2.0; 
     cell.phenotype.death.rates[0] = 0.02;
     // Are we done, or did we forget something?
     // is somebody still working here? 
}

DO

void my_function( Cell& cell )
{
     cell.phenotype.volume.total *= 2.0; 
     cell.phenotype.death.rates[0] = 0.02;
     return; 
}

Use tabs to indent the contents of a class or function.

This is to make the code easier to read. (Unfortunately PHP/HTML makes me use five spaces here instead of tabs.)

DON’T

class Secretion
{
 public:
std::vector<double> secretion_rates;
std::vector<double> uptake_rates; 
std::vector<double> saturation_densities; 
};

void my_function( Cell& cell )
{
cell.phenotype.volume.total *= 2.0; 
cell.phenotype.death.rates[0] = 0.02;
return; 
}

DO

class Secretion
{
 public:
     std::vector<double> secretion_rates;
     std::vector<double> uptake_rates; 
     std::vector<double> saturation_densities; 
};

void my_function( Cell& cell )
{
     cell.phenotype.volume.total *= 2.0; 
     cell.phenotype.death.rates[0] = 0.02;
     return; 
}

Use a single space to indent public and other keywords in a class.

This gets us some nice formatting in classes, without needing two tabs everywhere.

DON’T

class Secretion
{
public:
std::vector<double> secretion_rates;
std::vector<double> uptake_rates; 
std::vector<double> saturation_densities; 
}; // not enough whitespace

class Errors
{
     private:
          std::string none_of_your_business;
     public:
          std::string error_message;
          int error_code; 
}; // too much whitespace!

DO

class Secretion
{
 private:
 public:
     std::vector<double> secretion_rates;
     std::vector<double> uptake_rates; 
     std::vector<double> saturation_densities; 
}; 

class Errors
{
 private:
     std::string none_of_your_business;
 public:
     std::string error_message;
     int error_code; 
};

Avoid arcane operators, when clear logic statements will do.

It can be difficult to decipher code with statements like this:

phenotype.volume.fluid=phenotype.volume.fluid<0?0:phenotype.volume.fluid;

Moreover, C and C++ can treat precedence of ternary operators very differently, so subtle bugs can creep in when using the “fancier” compact operators. Variations in how these operators work across languages are an additional source of error for programmers switching between languages in their daily scientific workflows. Wherever possible (and unless there is a significant performance reason to do so), use clear logical structures that are easy to read even if you only dabble in C/C++. Compiler-time optimizations will most likely eliminate any performance gains from these goofy operators.

DON’T

// if the fluid volume is negative, set it to zero
phenotype.volume.fluid=phenotype.volume.fluid<0.0?0.0:phenotype.volume.fluid;

DO

if( phenotype.volume.fluid < 0.0 )
{
     phenotype.volume.fluid = 0.0;
}

Here’s the funny thing: the second logic is much clearer, and it took fewer characters, even with extra whitespace for readability!

Pass by reference where possible.

Passing by reference is a great way to boost performance: we can avoid (1) allocating new temporary memory, (2) copying data into the temporary memory, (3) passing the temporary data to the function, and (4) deallocating the temporary memory once finished.

DON’T

double some_function( Cell cell )
{
     return cell.phenotype.volume.total + 3.0; 
}
// This copies cell and all its member data!

DO

double some_function( Cell& cell )
{
     return cell.phenotype.volume.total + 3.0; 
}
// This just accesses the original cell data without recopying it. 
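
To see the copy that pass-by-value makes, you can count copy-constructor calls. The sketch below uses a hypothetical stand-in class, Counted_Cell, rather than the real PhysiCell Cell class. It also marks the reference as const, which is a common refinement when a function only reads the data; this guide does not require it.

```cpp
#include <cassert>

// Hypothetical stand-in for a cell class that counts how often it is copied.
class Counted_Cell
{
 public:
     static int copies; // demonstration counter only
     double volume;

     Counted_Cell() : volume( 1.0 )
     { }
     Counted_Cell( const Counted_Cell& other ) : volume( other.volume )
     { copies++; }
};
int Counted_Cell::copies = 0;

// Pass by value: the argument is copied on every call.
double by_value( Counted_Cell cell )
{
     return cell.volume + 3.0;
}

// Pass by (const) reference: the original object is read in place.
double by_const_reference( const Counted_Cell& cell )
{
     return cell.volume + 3.0;
}
```

Calling by_value on an existing Counted_Cell increases Counted_Cell::copies by one, while by_const_reference leaves the counter unchanged. Both return the same number.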

Where possible, pass by reference instead of by pointer.

There is no performance advantage to passing by pointers over passing by reference, but the code is simpler / clearer when you can pass by reference. It makes code easier to write and understand if you can do so. (If nothing else, you save yourself a character of typing each time you can replace “->” by “.”!)

DON’T

double some_function( Cell* pCell )
{
     return pCell->phenotype.volume.total + 3.0; 
}
// Writing and debugging this code can be error-prone.

DO

double some_function( Cell& cell )
{
     return cell.phenotype.volume.total + 3.0; 
}
// This is much easier to write. 

Be careful with static variables. Be thread safe!

PhysiCell relies heavily on parallelization by OpenMP, and so you should write functions under the assumption that they may be executed many times simultaneously. One easy source of errors is in static variables:

DON’T

double some_function( Cell& cell )
{
     static double four_pi = 12.566370614359172; 
     static double output; 
     output = cell.phenotype.geometry.radius; 
     output *= output; 
     output *= four_pi; 
     return output; 
}
// If two instances of some_function are running, they will both modify 
// the *same copy* of output 

DO

double some_function( Cell& cell )
{
     static double four_pi = 12.566370614359172; 
     double output; 
     output = cell.phenotype.geometry.radius; 
     output *= output; 
     output *= four_pi; 
     return output; 
}
// If two instances of some_function are running, they will each modify 
// their own copy of output, but still use the more efficient, 
// once-allocated copy of four_pi. This one is safe for OpenMP.
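
To see why this matters, here is a sketch of how such a function might be called from an OpenMP loop. The classes are simplified stand-ins (assumptions, not the real PhysiCell definitions), and surface_area and all_surface_areas are hypothetical names. Compile with OpenMP enabled (for example, -fopenmp with g++) to run the loop in parallel; without it, the pragma is ignored and the loop runs serially.

```cpp
#include <vector>

// Simplified stand-ins for the PhysiCell classes (assumed for this sketch).
struct Geometry { double radius = 0.0; };
struct Phenotype { Geometry geometry; };
struct Cell { Phenotype phenotype; };

// Thread safe: the only static is a read-only constant, and all working
// data lives in local (per-call) variables.
double surface_area( const Cell& cell )
{
     static const double four_pi = 12.566370614359172;
     double radius = cell.phenotype.geometry.radius;
     return four_pi * radius * radius;
}

// Each iteration writes only to its own entry of areas, so the
// iterations can safely run at the same time.
std::vector<double> all_surface_areas( const std::vector<Cell>& cells )
{
     std::vector<double> areas( cells.size(), 0.0 );
     #pragma omp parallel for
     for( int i = 0; i < static_cast<int>( cells.size() ); i++ )
     {
          areas[i] = surface_area( cells[i] );
     }
     return areas;
}
```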

Use std:: instead of “using namespace std”

PhysiCell uses the BioFVM and PhysiCell namespaces to avoid potential collisions with other codes. Likewise, codes that use PhysiCell may define functions whose names collide with the standard namespace. So, we explicitly write std:: whenever using functions in the standard namespace.

DON’T

using namespace std; 

cout << "Hi, Mom, I learned to code today!" << endl; 
string my_string = "Cheetos are good, but Doritos are better."; 
cout << my_string << endl; 

vector<double> my_vector;
my_vector.resize( 3, 0.0 ); 

DO

std::cout << "Hi, Mom, I learned to code today!" << std::endl; 
std::string my_string = "Cheetos are good, but Doritos are better."; 
std::cout << my_string << std::endl; 

std::vector<double> my_vector;
my_vector.resize( 3, 0.0 ); 
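
As an illustration of the kind of collision this avoids, suppose a project defines its own distance function, a name that also exists in the standard library. In the sketch below, my_model, its distance function, and number_of_entries are hypothetical names invented for this example. Qualifying every call makes it obvious which one is meant.

```cpp
#include <cmath>
#include <cstddef>
#include <iterator>
#include <vector>

namespace my_model
{
// Hypothetical user function: Euclidean distance between two points.
double distance( const std::vector<double>& a, const std::vector<double>& b )
{
     double sum = 0.0;
     for( std::size_t i = 0; i < a.size() && i < b.size(); i++ )
     {
          double difference = a[i] - b[i];
          sum += difference * difference;
     }
     return std::sqrt( sum );
}
}

// std::distance counts the steps between two iterators. The explicit
// std:: prefix keeps it from being confused with my_model::distance.
long number_of_entries( const std::vector<double>& v )
{
     return static_cast<long>( std::distance( v.begin(), v.end() ) );
}
```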

Camelcase is ugly. Use underscores.

This is purely an aesthetic distinction, but CamelCaseCodeIsUglyAndDoYouUseDNAorDna?

DON’T

double MyVariable1;
bool ProteinsInExosomes;
int RNAtranscriptionCount;

void MyFunctionDoesSomething( Cell& ImmuneCell );

DO

double my_variable1;
bool proteins_in_exosomes;
int RNA_transcription_count;

void my_function_does_something( Cell& immune_cell );

Use capital letters to declare a class. Use lowercase for instances.

To help in readability and consistency, declare classes with capital letters (but no camelcase), and use lowercase for instances of those classes.

DON’T

class phenotype { }; // defined (not just declared) so cell can hold one by value

class cell
{
 public:
     std::vector<double> position; 
     phenotype Phenotype; 
}; 

class ImmuneCell : public cell
{
 public:
     std::vector<double> surface_receptors; 
};

void do_something( cell& MyCell , ImmuneCell& immuneCell ); 

cell Cell;
ImmuneCell MyImmune_cell;

do_something( Cell, MyImmune_cell ); 

DO

class Phenotype { }; // defined (not just declared) so Cell can hold one by value

class Cell
{
 public:
     std::vector<double> position; 
     Phenotype phenotype; 
}; 

class Immune_Cell : public Cell
{
 public:
     std::vector<double> surface_receptors; 
};

void do_something( Cell& my_cell , Immune_Cell& immune_cell ); 

Cell cell;
Immune_Cell my_immune_cell;

do_something( cell, my_immune_cell ); 

2017 Macklin Lab speaking schedule

Members of Paul Macklin’s lab are speaking at the following events:
