The Big Algae Open Experiment

We set up the Big Algae Open Experiment to enhance our knowledge of microalgae by performing the biggest parallel algae experiment in history. We are inviting universities and citizen scientists to participate in an open-source data collection experiment on outdoor microalgal growth.

The Idea

Microalgae are tremendously diverse, inhabiting almost every biome of the planet. Estimates suggest that there are over 50,000 microalgal species, representing a rich resource for industrial biotechnology and an immense library of biosynthetic components which can be unlocked with synthetic biology.

Despite this promise, the potential of these “tiny plants” remains unrealised due to a lack of knowledge on how to predictably cultivate microalgae and benefit from this biodiversity. While algae could play a role as a crop of the future, this needs to be enabled by building up a mass of data and know-how (UK Roadmap for Algal Technologies, 2013). For example, popular and scientific search engines such as Google, Web of Science and PubMed return hundreds to thousands of times more results for “plant” than for “algae” or “microalgae” (figure 1).

This project aims to create the tools needed for a citizen science project called the “Big Algae Open Experiment”. The experiment aims to enhance our knowledge of microalgae by inviting universities and citizen scientists to participate in an open-source data collection experiment on outdoor microalgal growth. During the experiment, selected algal species will be systematically tested at several locations across the UK. The tools and activities that we plan to use to execute the project are displayed in figure 2.

The Team

Dr Paolo Bombelli,
Postdoctoral Researcher, Department of Biochemistry, University of Cambridge

Dr Brenda Parker,
Lecturer in Biochemical Engineering, University College London

Dr James Lawrence,
Teaching Fellow in Biochemical Engineering, University College London

Mr Marc Jones,
Graduate Student, John Innes Centre, Norwich


Project Outputs

Project Report

Summary of the project's achievements and future plans

Project Proposal

Original proposal and application

Project Outcomes

Project website and two blog posts about Latitude Festival hosted on UCL and OpenPlant websites


Big Algae Open Experiment: Education and Future Work

Summary

Algae are amazing: they recycle over half of the carbon dioxide we exhale, and form the basis of many food chains, yet we still understand very little about how they grow. In the future, we may wish to cultivate algae for food, fuel, or to clean up wastewater, so we need to understand more about their biology!

With this in mind, we have set up the Big Algae Open Experiment to help us enhance our knowledge by performing the biggest parallel algae experiment in history. We are inviting universities and citizen scientists to participate in an open-source data collection experiment on outdoor microalgal growth. Up and down the UK, we’ll be running experiments using a bioreactor we have designed and asking people to submit their recordings of how well the algae are growing. Following and recording the algal growth will be easy and fun, thanks to a smartphone app: the Alg-app. The Alg-app will enable everyone with access to a smartphone to get involved.

Report, Outcomes and Follow-On Plans

1. The photobioreactor

Part of the project’s aim was to develop an open-source photobioreactor which could be used in the Big Algae Open Experiment. The main criteria for the reactor design were that it should be cheap to produce, robust and easy to assemble and use.

Design of the photobioreactor

To minimise the cost of manufacture, the photobioreactor has been designed so that it can be constructed from widely available materials, primarily acrylic. The reactor base, lid, flange and other components (see figure 1A) have been designed so that they can be fabricated from 3mm acrylic sheets using a laser cutter. The tubes used for the riser and downcomer sections of the reactor are also made from acrylic, and can be ordered prefabricated from a UK supplier (theplasticshop.co.uk). The tubes used in the design have a wall thickness of 4mm and outer diameters of 110mm (riser) and 150mm (downcomer), so that there is an equal cross-sectional area in the riser and downcomer sections of the reactor. This gives a good liquid circulation rate when the reactor is running. When assembled, the reactor has a total volume of approximately 6L.
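As a rough sanity check on these dimensions, the working volume can be estimated from the downcomer bore and the fill height. The short Python sketch below is a back-of-the-envelope calculation only; the function name is ours, and it ignores the volume displaced by the riser walls and the shaped base.

    import math

    def working_volume_litres(fill_height_m, downcomer_od_m=0.150, wall_m=0.004):
        """Estimate the liquid volume held in the downcomer bore.

        A rough approximation: it ignores the volume displaced by the
        riser tube walls and the shape of the reactor base.
        """
        bore_radius = (downcomer_od_m - 2 * wall_m) / 2  # 142mm ID -> 71mm radius
        return math.pi * bore_radius ** 2 * fill_height_m * 1000  # m^3 -> litres

    # A fill height of about 0.38m gives roughly the quoted 6L working volume.
    print(round(working_volume_litres(0.38), 1))  # ~6.0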

The acrylic components of the reactor base and lid are bonded using plastic weld, which can be applied using a pipette or paintbrush. The reactor base is designed to accept an aquarium airstone to serve as a sparger – this is sealed in place with silicone sealant. The downcomer tube is reversibly attached to the base of the reactor using a gasket cast from silicone sealant, which is compressed and held by the four-part flange and twelve 70mm M5 screws. The screws also function as a reactor stand, lifting it enough that an air supply can be connected to the airstone. The assembled reactor is shown in figures 1B and C below.

Figure 1: The design of the airlift photobioreactor; (A) a CAD model of the exploded view of the reactor and its components; (B) a CAD model of the assembled reactor; (C) the reactor assembled in the laboratory with the cell counting calibration window and Arduino light meter in place.

Assembly and Use of the photobioreactor

When assembling the reactor for the first time, the components making up the lid and base of the reactor need to be bonded together with plastic weld, and the airstone should be set into the base using sealant, which must be left to set overnight before the reactor can be used. For the most part, the acrylic components can be aligned by eye, or by using the holes machined for the M5 screws. Plastic weld can then be applied to the joins between the parts to quickly create a strong bond.

Sealing the airstone in place is a critical step to ensure that there is no leaking from the reactor while it is in use. Occasional leakages have been observed when using the reactor to culture algae, all of which stemmed from the airstone port. If a leak does occur from the airstone port when the reactor is in use, the reactor must be stopped and taken apart to reseal the airstone. Any aquarium airstone with a standard 5mm connection can be used, but it was found that using stones with a flat bottom allows for easier sealing to the reactor base. Also, applying sealant externally around the protruding tubing connector of the airstone helped reduce the risk of leaking.

Once the acrylic parts have been bonded together and the sealant around the airstone is set, the next step is to cast a gasket for the base of the reactor. This is done by filling the recess in the reactor base with sealant and leaving it to set. The cured sealant can be removed from the recess, resulting in a gasket that can be fitted around the bottom of the downcomer to form a water-tight seal when compressed by the flange.

Once the components are ready, a single person can assemble the reactor in approximately 20 minutes. The only additional tools required are a spanner and screwdriver. The first step is to fit the downcomer and gasket in place in the base of the reactor, then to attach the flange components to compress the gasket and form a water-tight seal. When putting the four flange components in place it is important to tighten the assembly screws around the base in a balanced fashion to apply an even pressure and prevent the downcomer tube from being inserted incorrectly. Once this is in place, the riser tube can be positioned on its supports in the reactor base, and the aquarium pump can be attached to the airstone via rubber tubing. The cell count calibration sticker can be applied to the outside of the downcomer tube and the Arduino-based light meter (if it is being used) can be started up.

After the reactor has been assembled and used successfully, it can be cleaned out with water and used again without having to dismantle and reassemble it. If a thorough clean is required, the base of the reactor can be removed and the parts can be soaked in bleach.

Arduino Light Meter

An Arduino-based light meter has also been designed for use with the photobioreactor, as a way of automatically including measurements of local light conditions in the data submitted to the Big Algae website. The meter reads the light level in lux using a BH1750 light sensor breakout board (designed by mysensors.org), converts the value to binary and displays this on an LCD panel as a series of illuminated segments. A schematic of the circuit is shown in figure 2A. The LCD also displays the amount of time that the reactor has been running for in the same format (see figure 2B).

The light meter is mounted on the side of the reactor, with the LCD positioned near the cell counting calibration window. When a photo of the reactor is submitted to the Big Algae website the image normalisation algorithm recognises the LCD, reads the position of the illuminated segments on the display (using the alignment strip to determine orientation) and uses this to determine the time and light level when the photo was taken. These values are attributed to the photo when it is stored.
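To illustrate the decoding step, the sketch below converts a row of segment on/off states into an integer. The helper name is hypothetical, and we assume the segment states have already been read from the image at the positions fixed by the alignment strip; the production pipeline will differ in detail.

    def decode_segments(segment_states):
        """Turn a row of LCD segment states into an integer.

        segment_states: booleans, most significant bit first, read off
        the display at the positions fixed by the alignment strip.
        """
        value = 0
        for lit in segment_states:
            value = (value << 1) | int(lit)
        return value

    # Example: segments 1011010 -> 90 (e.g. a lux reading or minutes elapsed)
    print(decode_segments([True, False, True, True, False, True, False]))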

Figure 2: Arduino-based meter for measuring ambient light; (A) the circuit comprising an Arduino Uno, LCD and BH1750 breakout board; (B) the LCD display showing the current light level and duration of the experiment in binary.

Cost to produce the photobioreactor

Table 1: Cost of photobioreactor materials

The approximate total cost (excluding delivery) of the components required to build the photobioreactor is shown in table 1. We plan to create reactor ‘kits’ which could be sold at cost price to schools participating in the experiment, allowing schools to participate even if they do not have access to the tools required to manufacture the parts themselves. These kits would contain all of the parts required to build the reactor, instructions for assembly and use, as well as a supply of media concentrate. To keep costs down, the Arduino and the components required for the light meter have not been included; these would be suggested to participants as an optional extra.

We have identified a manufacturer that would be able to produce the laser-cut acrylic parts of the reactor in bulk, which would add around £7-8 to the cost of each reactor kit. This would bring the total cost of each kit to £60, which should be affordable to most schools.

Work to complete the photobioreactor

The photobioreactor design described fulfils most of the criteria set out. It can be manufactured at low cost, it is relatively easy to assemble and use and, if assembled carefully, it can be operated for long periods of time without leaking. There are still changes we would like to make to the design to simplify the assembly process, in particular switching to a pre-fabricated gasket for sealing the reactor base instead of casting one with silicone. It would also be useful to have a housing for the Arduino light meter, so that it can be mounted easily on the side of the reactor, near the calibration window. Like the light meter itself, this would not be part of the kits we distribute to schools, but an optional extra. Once these changes have been made we will release the designs on GitHub and Thingiverse.

2. The smart-phone app (Alg-app)

The computational methods developed as part of the Big Algae Open Experiment consist of the website associated with the project and the image analysis pipeline. An explanation of the methods and some preliminary results from them are provided, along with some challenges faced during the project and an outlook for the directions the project could take in the future.

Website for interface with the Alg-app

To allow for data collection and information dissemination, a centralized online platform was created (http://bigalgae.com). Through this platform, general information about the Big Algae Open Experiment was made available, and users can register bioreactors. Registration requires a team name, an e-mail address and the location of the bioreactor. Location information is collected so that the environmental conditions experienced by bioreactors placed outdoors can be determined. After a user's email address has been validated, upload and experiment validation codes are sent to the user. These codes are required to upload data and to register new experiments for a particular reactor, respectively. Having a form of validation such as this significantly lowers the risk of abuse without negatively affecting the usability of the site. When uploading an image, cell count data and optical density measurements may also be uploaded, with dry mass information being uploaded retrospectively. During the upload process, the image enters the image analysis pipeline to determine whether it contains a calibration window and an Arduino-controlled time display (see Image analysis pipeline below). Presence of the calibration window is essential for the upload to be successful, while the presence of the time display is not. The online platform was created using the Python web framework Flask, is hosted on DigitalOcean cloud infrastructure, and uses Ansible to automate server management tasks. The Google Maps and Google Places application programming interfaces are used to provide interactive maps on the site and to determine longitude and latitude information for each bioreactor.
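To make the upload flow concrete, here is a minimal Flask sketch of an upload endpoint. The route, form field names and helper functions (validate_code, run_pipeline, store_measurement) are hypothetical stand-ins rather than the actual bigalgae.com code.

    from collections import namedtuple
    from flask import Flask, request, jsonify

    app = Flask(__name__)
    PipelineResult = namedtuple("PipelineResult", "window_found colour_stats")

    def validate_code(reactor_id, code):
        # Placeholder: in production, check the code against the reactor database.
        return code is not None

    def run_pipeline(image_file):
        # Placeholder for the OpenCV/GPy pipeline described below.
        return PipelineResult(window_found=True, colour_stats=None)

    def store_measurement(reactor_id, result, cell_count=None, optical_density=None):
        # Placeholder: persist the measurement against the reactor record.
        pass

    @app.route("/upload/<reactor_id>", methods=["POST"])
    def upload(reactor_id):
        if not validate_code(reactor_id, request.form.get("upload_code")):
            return jsonify(error="invalid upload code"), 403
        result = run_pipeline(request.files["image"])
        # The calibration window must be present for the upload to succeed;
        # the Arduino time display is optional.
        if not result.window_found:
            return jsonify(error="no calibration window detected"), 400
        store_measurement(reactor_id, result,
                          cell_count=request.form.get("cell_count"),
                          optical_density=request.form.get("od"))
        return jsonify(status="ok")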

Figure 3 demonstrates the effect the normalization algorithm has on a selection of six images. The pixels from the coloured squares in the reference image (Figure 3a) are used to train a Gaussian process model for each colour channel within each image. Figures 3b-3g are the unnormalized images uploaded to the site after they have undergone image transformation to orient the anchor points. Figures 3h-3m are the corresponding normalized images after the Gaussian process based algorithm has been applied.

Image analysis pipeline

The image analysis pipeline has been developed in Python using the OpenCV computer vision library [1] and consists of calibration window detection and image normalization.

Detection of the calibration window

Images are converted to greyscale and have a Gaussian blur applied before being thresholded. The threshold converts images to black and white, to which a contour-finding algorithm is applied. Contours are arranged into a tree-like structure, with a contour's parent being the contour which contains it. To find the three anchor points of the calibration window, the algorithm iterates over the list of contours and finds contours whose areas are in the correct ratios. The ratios the algorithm looks for are a 9:16 area ratio between a contour and its parent and a 9:25 area ratio between a contour and its grandparent. Once the three anchor points are found, the image is transformed to orient the picture correctly and correct any skewing.
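A minimal OpenCV sketch of this detection step is given below. The use of Otsu thresholding, the 25% tolerance on the area ratios and the function name are our own illustrative choices; the production code may filter contours differently.

    import cv2

    def find_anchor_points(image, tol=0.25):
        """Locate candidate anchor markers of the calibration window.

        Anchors are nested squares, so we look for contours whose area is
        roughly 9/16 of their parent's and 9/25 of their grandparent's.
        """
        grey = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        blurred = cv2.GaussianBlur(grey, (5, 5), 0)
        _, binary = cv2.threshold(blurred, 0, 255,
                                  cv2.THRESH_BINARY | cv2.THRESH_OTSU)
        # findContours returns (contours, hierarchy) in OpenCV 4.x; taking the
        # last two values also works with the 3.x three-value signature.
        contours, hierarchy = cv2.findContours(binary, cv2.RETR_TREE,
                                               cv2.CHAIN_APPROX_SIMPLE)[-2:]
        anchors = []
        for i, contour in enumerate(contours):
            parent = hierarchy[0][i][3]
            if parent == -1:
                continue
            grandparent = hierarchy[0][parent][3]
            if grandparent == -1:
                continue
            area = cv2.contourArea(contour)
            parent_area = cv2.contourArea(contours[parent])
            grand_area = cv2.contourArea(contours[grandparent])
            if parent_area == 0 or grand_area == 0:
                continue
            if (abs(area / parent_area - 9 / 16) < tol * 9 / 16 and
                    abs(area / grand_area - 9 / 25) < tol * 9 / 25):
                moments = cv2.moments(contour)
                if moments["m00"] == 0:
                    continue
                # Use the innermost contour's centroid as the anchor position.
                anchors.append((moments["m10"] / moments["m00"],
                                moments["m01"] / moments["m00"]))
        return anchors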

Image normalization

Using the positions of the anchor points, the coloured squares and the transparent window are located in the image and the pixel information for them extracted. The pixel information consists of red, green and blue channel values, which take integer values between 0 and 255 inclusive, as well as positional information. To allow the algorithm to run in reasonable time, the number of pixels is downsampled. To normalize the colours, a Gaussian process model is applied to each colour channel (red, green and blue) separately. Gaussian processes are a probabilistic framework for modelling unknown functions, and are used here because of the unknown non-linearities involved in normalizing photos taken with different equipment and under different lighting conditions. They are implemented using the GPy Python library [2]. To parameterize the Gaussian process models, pixel information from the coloured squares and from the black squares within the anchor points is used. Positional as well as RGB values of the pixels are combined to ensure that variation due to the camera and lighting can be taken into account. The Gaussian process models are constructed using a linear kernel combined with a squared exponential kernel; this allows the model to capture the general linear trend while also accommodating non-linearities. The pixel information from an image, as well as from a reference image of the calibration window, is used to train the models for each colour channel. Once parameterized, the Gaussian process models are used to normalize the pixel information from the transparent section of the calibration window, which corresponds to the colour of the algae in the bioreactor. Each pixel is normalized by inputting its position in the image and its unnormalized RGB values into each Gaussian process model. The output from each Gaussian process model is a mean and variance value for its colour channel.
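The sketch below shows how one channel's model might be set up with GPy. The array shapes and function name are illustrative assumptions; only the kernel choice (linear plus squared exponential/RBF) and the per-channel mean and variance outputs come from the description above.

    import GPy

    def normalise_channel(train_inputs, ref_values, window_pixels):
        """Fit and apply a per-channel normalisation GP (illustrative sketch).

        train_inputs : (N, 5) array of [x, y, R, G, B] for pixels from the
                       coloured and black calibration squares
        ref_values   : (N, 1) array of this channel's values in the reference image
        window_pixels: (M, 5) array of pixels sampled from the algal window
        """
        # Linear kernel for the overall trend plus a squared exponential (RBF)
        # kernel for the non-linearities, as described in the text.
        kernel = GPy.kern.Linear(input_dim=5) + GPy.kern.RBF(input_dim=5)
        model = GPy.models.GPRegression(train_inputs, ref_values, kernel)
        model.optimize()
        # The model returns a normalised mean and a variance for each pixel.
        mean, variance = model.predict(window_pixels)
        return mean, variance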

Figure 4 shows the relationship between colour intensity values before and after the normalization algorithm has been applied, for each colour channel separately. The two rows of the figure correspond to the results of the normalization algorithm when sampling 100 pixels from the images and when sampling 500 pixels. The relationships between the pre- and post-normalization values all show non-linearities, validating the application of Gaussian processes to this colour normalization problem. Also striking is the similarity between the results obtained using 100 pixels to train and sample the Gaussian processes and those obtained using 500 pixels. This implies that downsampling the image to as few as 100 pixels still captures the variability in the pixel colour intensities.

Challenges and Outlook

Website plotting features using D3

To facilitate the use of the website to track algal growth over time, options to graph the data will be implemented. These graphing options will make use of D3, a JavaScript visualization library. The advantage of client-side visualization is reduced server-side computation as well as responsive, dynamic plots. The plots will allow the user to plot different measures of algal density (optical density measurements, dry mass and cell count) against time. The time measurement will be determined in one of three ways, with the order of precedence determined by their differing levels of accuracy. The most accurate measure of time elapsed since the start of the experiment is the Arduino-based time. If the image analysis pipeline does not detect a time display in the photo, and EXIF data is available, the date and time the photo was taken are read from the photo's metadata. The disadvantage of using the metadata is that the internal date and time set on the phone may be incorrect or set to a different time zone. The final fallback is the time the image was uploaded to the server. The disadvantage of using this time is that users may take photos and upload them at a later time. This may be the case if Internet access is intermittent or not possible on the device being used, such as when a digital camera is used to capture the images.
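This precedence rule amounts to a simple fallback chain, sketched below with hypothetical argument names for the three time sources.

    def experiment_timestamp(arduino_time=None, exif_time=None, upload_time=None):
        """Return the most trustworthy timestamp available.

        Precedence: the Arduino display read from the photo, then the photo's
        EXIF metadata, then the server upload time as a last resort.
        """
        for candidate in (arduino_time, exif_time, upload_time):
            if candidate is not None:
                return candidate
        raise ValueError("no timestamp available for this image")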

Figure 5 demonstrates the effect of the normalization algorithm on the mean measurements of colour intensity for repeated measurements. Two bioreactors were set up and photographed on two separate dates. Assuming the algal density did not change significantly over the period during which the bioreactors were photographed, the images taken can be treated as repeated measurements. For each repeat measurement, the mean colour intensity for the algal window was calculated before and after the normalization algorithm was applied. The spread of these mean values after normalization increased in some cases, such as the blue and green channels in the 'Experiment: 2 Date: 2016-02-13' sample. This trend was not consistent, however, with a decrease in the interquartile range observed in all colour channels post-normalization in the 'Experiment 2: Date: 2016-02-20' sample.

Pixel sampling in the normalization algorithm

The Gaussian process normalization requires many matrix computations. The size of the matrices used in these computations is directly related to both the number of colour square pixels used to train the Gaussian process and the number of pixels sampled from the algal window. Although a more accurate estimate of the RGB colour of the algal window would be obtained by using more pixels in training and sampling, this has to be balanced against the computational run time of the image analysis pipeline, which is pivotal to the user experience of the website. If the run time is excessive when the user uploads an image, the website may feel unresponsive. Conversely, if the pipeline were run in batch mode, the data points from uploaded images would only become visible on the growth graph when the batch runs, reducing the gratification the user receives from contributing to the project. To counter these problems, a two-pass image analysis pipeline could be implemented. The first pass would execute when the image is uploaded, with a low number of sampled pixels to keep the run time short. The second pass would run during quiet periods on the website, using more pixels in the training and prediction steps than the first pass. Because the user does not wait for the response of the second pass, its run time does not affect the user experience. Reassuringly, using 100 sampled pixels does not produce markedly different final results from using 500 sampled pixels (Figures 4 and 5). Nevertheless, using more sampled pixels does allow the variation present in each image to be better estimated, and reduces the interquartile range observed between repeated measurements post-normalization (Figure 5). This improvement supports the use of a two-pass image analysis approach in the future.

Prediction of algal density

Currently the image analysis pipeline only checks for the presence or absence of the calibration window when an image is uploaded, and does not estimate algal density. During the project, a number of algal growth experiments were carried out to record algal growth using accurate quantification measures. The measures used were cell concentration (g/L), cell density (cells/mL) and optical density at wavelengths of 680nm and 750nm. Images of the bioreactor with a calibration strip fitted were also recorded.

The relationships between the accurate quantification measures and the colour intensities predicted by the normalization algorithm show non-linear trends in all cases. The trends follow inverse relationships, whereby the lower the colour intensity, the higher the measure of algal density. This makes intuitive sense: as the algal reaction mixture becomes denser, less of the white background is observed. It also suggests that estimating algal density from digital images of the reaction mixture will be most sensitive within a certain range of density values. Currently, the number of data points collected is not sufficient to form training and testing datasets to validate a prediction algorithm. Once more training data has been obtained, however, each uploaded image will be normalized, a prediction of the algal density within the bioreactor will be made, and the result will be sent back to the user within a few seconds. While not yet validated by experimental data, the method of algal density prediction carried out by the image analysis pipeline will consist of a GP regression model. In order to relate the colour of the algal reaction mixture to algal density, the distribution of values for each colour channel needs to be known. For each pixel normalized using the Gaussian process method detailed above, a normalized mean value and a variance value are output for each colour channel. Finding the algal reaction mixture colour for each channel by averaging the normalized per-pixel means would not incorporate the variation observed for each pixel. Therefore, to incorporate this variation, the per-pixel probability distributions are combined and points are sampled from the combined distribution. An overall mean and variance for the algal reaction mixture are then calculated from the sampled values for each colour channel. These colour values are used as inputs to another Gaussian process model to predict the algal density measurements (Figure 6).
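The combination step might look like the following sketch, which treats the algal window as an equal-weight mixture of the per-pixel Gaussians and samples from it; the function name and sampling scheme are our illustrative assumptions.

    import numpy as np

    def window_colour_summary(means, variances, n_samples=1000, seed=None):
        """Summarise one colour channel of the algal window.

        means, variances: per-pixel GP outputs (1D NumPy arrays of equal
        length). Samples are drawn from the equal-weight mixture of
        per-pixel Gaussians so that per-pixel uncertainty feeds into the
        overall mean and variance.
        """
        rng = np.random.default_rng(seed)
        idx = rng.integers(0, len(means), size=n_samples)  # pick a pixel per draw
        samples = rng.normal(means[idx], np.sqrt(variances[idx]))
        return samples.mean(), samples.var()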

Figure 6: Relationships between measures of algal density and normalized colour intensities

Each row of the figure corresponds to a different colour channel, while each column corresponds to a different measure of algal density. Each data point represents a single measurement of algal density and a single image. The colour intensity measurement is normalized using the Gaussian process method described in the main text. The error bars on the data points incorporate both the image variability and the uncertainty in the normalized value, as estimated by the Gaussian process based algorithm. This is aggregated data from multiple calibration experiments carried out over a period of four months.

3. Logo and graphical appearance

One of our aims was to involve designers and other visual artists to enable us to improve our communications. We have been working with Laura Gordon, a graphic designer, to develop our visual appearance on educational materials and branding (Figure 7).

Figure 7: Designs created by Laura Gordon, for logos and visual identity.

The Big Algae Open Experiment events

The PBR and the Alg-app have already been presented and tested during two weekend lab sessions, as part of “Co-Lab”, an interactive workshop for designers and scientists, held in conjunction with the Institute of Making at UCL. We introduced the concept behind the bioreactor, and then invited participants to take OD readings and use the Alg-app. This was an informal opportunity to get feedback from an audience new to algae, who had no previous experience in the lab. Data gathered over the two weekends was used to test the upload procedure and help validate the algorithm.

We are now planning to test our PBR and the Alg-app through Widening Participation activities, and to make them available as educational tools through the Engineering Education website. Our timetable for testing the PBR and the Alg-app as educational tools in schools spans from May to July.

Widening Participation

We have been in contact with the Widening Participation (WP) team at UCL about how we can use the Big Algae Open Experiment in collaboration with their efforts to encourage underrepresented groups to consider higher education. From this meeting came two suggestions:

Include a page on the website where students can find out more about our disciplines and what we study

Use consent forms that enable WP departments in our respective universities to track the outcomes of high school students who have participated in the project. This could be useful in the long term: if we run the project over a number of years, we can see whether there is an impact, e.g. more students choosing biology at A-level or going on to pick degree courses related to synbio, plant sciences or biochemical engineering.

Website development to support educational offering

Through contact with the team at Engineering Education, we received advice based on their experience with the “BristleBot” project, where they supplied a kit to schools and a PDRA visited the class to help with set-up. They stressed the importance of reliability and ease of set-up, as teachers often have limited time to spend on troubleshooting before lessons. Two suggestions were made to minimise problems on the user side:

Incorporation of an FAQ section for participants, listing solutions to common issues or queries regarding the set up or uploading of data

Inclusion of a “forum” whereby users can communicate with the organising team and get support or report observations.

References

[1] Bradski, G., “The OpenCV Library”, Dr. Dobb's Journal of Software Tools, 2000.

[2] The GPy authors, “GPy: A Gaussian process framework in Python”, 2012-2015, http://github.com/SheffieldML/GPy.