The Big Algal Open Experiment
We set up the Big Open Algae Experiment to help us enhance our knowledge by performing the biggest parallel algae experiment in history. We are inviting universities and citizen scientists to participate in an open-source data collection experiment on outdoor microalgal growth.
The Idea
Microalgae are tremendously diverse, inhabiting almost every biome of the planet. Estimates suggest that there are over 50,000 microalgal species¹, representing a rich resource for industrial biotechnology and an immense library of biosynthetic components which can be unlocked with synthetic biology.
Despite this promise, the potential of these “tiny plants” remains unrealised due to a lack of knowledge on how to predictably cultivate microalgae and benefit from this biodiversity. While algae could play a role as a crop of the future, this needs to be enabled by building up a mass of data and know-how (UK Roadmap for Algal Technologies, 2013). For example, popular and scientific search engines such as Google, Web of Science and PubMed return hundreds to thousands of times more results when “plant” is searched for instead of “algae” or “microalgae” (figure 1).
This project aims to create the tools needed for a citizen science project called the “Big Algae Open Experiment”. The experiment aims to enhance our knowledge of microalgae by inviting universities and citizen scientists to participate in an open-source data collection experiment on outdoor microalgal growth. During the experiment, selected algal species will be systematically tested in several locations across the UK. The tools and activities that we plan to use to carry out the project are shown in figure 2.
The Team
Dr Paolo Bombelli,
Postdoctoral Researcher, Department of Biochemistry, University of Cambridge
Dr Brenda Parker,
Lecturer in Biochemical Engineering, University College London
Dr James Lawrence,
Teaching Fellow in Biochemical Engineering, University College London
Mr Marc Jones,
Graduate Student, John Innes Centre, Norwich
Project Outputs
Project Report
Summary of the project's achievements and future plans
Project Proposal
Original proposal and application
Big Algae Open Experiment: Education and Future Work
Summary
Algae are amazing: they recycle over half of the carbon dioxide we exhale, and form the basis of many food chains, yet we still understand very little about how they grow. In future, we may wish to cultivate algae for food, fuel, or to clean up wastewater, so we need to understand more about their biology!
With this in mind, we have set up the Big Open Algae Experiment to help us enhance our knowledge by performing the biggest parallel algae experiment in history. We are inviting universities and citizen scientists to participate in an open-source data collection experiment on outdoor microalgal growth. Up and down the UK, we’ll be running experiments using a bioreactor we have designed and asking people to submit their recordings of how well the algae are growing. Following and recording the algal growth will be easy and fun, thanks to a smartphone app: the Alg-app. The Alg-app will enable anyone with access to a smartphone to get involved.
Report and Outcomes and Follow-On Plans
1. The photobioreactor
Part of the project’s aim was to develop an open-source photobioreactor which could be used in the Big Algae Open Experiment. The main criteria for the reactor design were that it should be cheap to produce, robust and easy to assemble and use.
Design of the photobioreactor
To minimise the cost of manufacture, the photobioreactor has been designed so that it can be constructed from widely-available materials, primarily acrylic. The reactor base, lid, flange and other components (see figure 1A) have been designed so that they can be fabricated from 3mm acrylic sheets using a laser cutter. The tubes used for the riser and downcomer sections of the reactor are also made from acrylic, and can be ordered prefabricated from a UK supplier (theplasticshop.co.uk). The tubes used in the design have a wall thickness of 4mm and an outer diameter of 110mm and 150mm respectively, so that there is an equal cross-sectional area in the riser and downcomer sections of the reactor. This gives a good liquid circulation rate when the reactor is running. When assembled, the reactor has a total volume of approximately 6L.
The acrylic components of the reactor base and lid are bonded using plastic weld, which can be applied using a pipette or paintbrush. The reactor base is designed to accept an aquarium airstone to serve as a sparger – this is sealed in place with silicone sealant. The downcomer tube is reversibly attached to the base of the reactor using a gasket cast from silicone sealant, which is compressed and held by the four-part flange and twelve 70mm M5 screws. The screws also function as a reactor stand, lifting it enough that an air supply can be connected to the airstone. The assembled reactor is shown in figures 1B and C below.
Assembly and Use of the photobioreactor
When assembling the reactor for the first time, the components making up the lid and base of the reactor need to be bonded together with plastic weld, and the airstone should be set into the base using sealant, which must be left to set overnight before the reactor can be used. For the most part, the acrylic components can be aligned by eye, or by using the holes machined for the M5 screws. Plastic weld can then be applied to the joins between the parts to quickly create a strong bond.
Sealing the airstone in place is a critical step to ensure that there is no leaking from the reactor while it is in use. Occasional leakages have been observed when using the reactor to culture algae, all of which stemmed from the airstone port. If a leak does occur from the airstone port when the reactor is in use, the reactor must be stopped and taken apart to reseal the airstone. Any aquarium airstone with a standard 5mm connection can be used, but it was found that using stones with a flat bottom allows for easier sealing to the reactor base. Also, applying sealant externally around the protruding tubing connector of the airstone helped reduce the risk of leaking.
Once the acrylic parts have been bonded together and the sealant around the airstone is set, the next step is to cast a gasket for the base of the reactor. This is done by filling the recess in the reactor base with sealant and leaving it to set. The cured sealant can be removed from the recess, resulting in a gasket that can be fitted around the bottom of the downcomer to form a water-tight seal when compressed by the flange.
Once the components are ready, a single person can assemble the reactor in approximately 20 minutes. The only additional tools required are a spanner and a screwdriver. The first step is to fit the downcomer and gasket in place in the base of the reactor, then to attach the flange components to compress the gasket and form a water-tight seal. When putting the four flange components in place, it is important to tighten the assembly screws around the base in a balanced fashion to apply an even pressure and prevent the downcomer tube from being inserted incorrectly. Once this is in place, the riser tube can be positioned on its supports in the reactor base, and the aquarium pump can be attached to the airstone via rubber tubing. The cell count calibration sticker can be applied to the outside of the downcomer tube and the Arduino-based light meter (if it is being used) can be started up.
After the reactor has been assembled and used successfully, it can be cleaned out with water and used again without having to dismantle and reassemble it. If a thorough clean is required, the base of the reactor can be removed and the parts can be soaked in bleach.
Arduino Light Meter
An Arduino-based light meter has also been designed for use with the photobioreactor, as a way of automatically including measurements of local light conditions in the data submitted to the Big Algae website. The meter reads the light level in lux using a BH1750 light sensor breakout board (designed by mysensors.org), converts the value to binary and displays this on an LCD panel as a series of illuminated segments. A schematic of the circuit is shown in figure 2A. The LCD also displays the amount of time that the reactor has been running for in the same format (see figure 2B).
The light meter is mounted on the side of the reactor, with the LCD positioned near the cell counting calibration window. When a photo of the reactor is submitted to the Big Algae website the image normalisation algorithm recognises the LCD, reads the position of the illuminated segments on the display (using the alignment strip to determine orientation) and uses this to determine the time and light level when the photo was taken. These values are attributed to the photo when it is stored.
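The encoding used by the display can be illustrated with a short sketch. This is not the project's Arduino firmware or its image-reading code: the function names and the choice of 16 segments are assumptions made for illustration (the BH1750 reports 16-bit readings, but the number of segments on the LCD is not stated here).

```python
def to_segments(value: int, n_segments: int = 16) -> list[bool]:
    """Return the on/off state of each segment, most significant bit first."""
    if not 0 <= value < 2 ** n_segments:
        raise ValueError("value does not fit in the available segments")
    return [bool((value >> bit) & 1) for bit in range(n_segments - 1, -1, -1)]


def from_segments(segments: list[bool]) -> int:
    """Recover the integer from segment states read off a photo."""
    value = 0
    for lit in segments:
        value = (value << 1) | int(lit)
    return value


lux = 5300  # hypothetical light reading
segments = to_segments(lux)
# Show lit segments as '#' and dark segments as '.'
print("".join("#" if lit else "." for lit in segments))
assert from_segments(segments) == lux
```

A fixed-width binary pattern of this kind only requires the image analysis to decide whether each segment is lit or dark, which is likely to be more robust than recognising printed digits in a photo.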
Cost to produce the photobioreactor
The approximate total cost (excluding delivery) of the components required to build the photobioreactor itself is shown below in table 1. We plan to create reactor ‘kits’ which could be sold to schools participating in the experiment at cost price, allowing schools to participate even if they do not have access to the tools required to manufacture the parts themselves. These kits would contain all of the parts required to build the reactor, instructions for assembly and use, and a supply of media concentrate. To keep costs down, the Arduino and components required for the light meter have not been included; these would be suggested to participants as an optional extra.
We have identified a manufacturer that would be able to produce the laser-cut acrylic parts of the reactor in bulk, which would add around £7-8 to the cost of each reactor kit. This would bring the total cost of each kit to £60, which should be affordable to most schools.
Work to complete the photobioreactor
The photobioreactor design described fulfils most of the criteria set out. It can be manufactured at low cost, it is relatively easy to assemble and use and, if assembled carefully, it can be operated for long periods of time without leaking. There are still changes we would like to make to the design to simplify the assembly process, in particular switching to a pre-fabricated gasket for sealing the reactor base instead of casting one from silicone. It would also be useful to have a housing for the Arduino light meter, so that it can be mounted easily on the side of the reactor, near the calibration window. Like the light meter itself, this would not be part of the kits we distribute to schools, but an optional extra. Once these changes have been made we will release the designs on GitHub and Thingiverse.
2. The smart-phone app (Alg-app)
The computational methods developed as part of the Big Algae Open Experiment consist of the website associated with the project and the image analysis pipeline. An explanation of the methods and some preliminary results from them are provided, along with some challenges faced during the project and an outlook for the directions the project could take in the future.
Website for interface with the Alg-app
To allow for data collection and information dissemination, a centralized online platform was created (http://bigalgae.com). Through this platform, general information about the Big Algae Open Experiment was made available and users can register bioreactors. Registration requires a team name, an e-mail address and the location of the bioreactor. Location information is collected so that the environmental conditions experienced by bioreactors placed outdoors can be determined. After a user's e-mail address has been validated, an upload code and an experiment validation code are sent to the user. These codes are required to upload data and to register new experiments for a particular reactor, respectively. Having a form of validation such as this lowers the risk of abuse significantly, without negatively affecting the usability of the site.
When uploading an image, cell count data and optical density measurements may also be uploaded, with dry mass information being added retrospectively. During the image upload process, the image enters the image analysis pipeline to determine whether it contains a calibration window and an Arduino-controlled time display (Section 2 - Algorithm). The presence of the calibration window is essential for the upload to be successful, while the presence of the time display is not.
The online platform was created using Flask, a Python web framework, hosted on DigitalOcean cloud infrastructure, and made use of Ansible to automate server management tasks. The Google Maps and Google Places application programming interfaces were used to provide interactive maps on the site and to determine the longitude and latitude of each bioreactor.
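The two-code scheme described above could be sketched as follows. This is a minimal illustration using only the Python standard library, not the site's actual Flask code: the function names, the dictionary keys and the eight-character code length are all assumptions.

```python
import hmac
import secrets


def new_reactor_codes() -> dict[str, str]:
    """Create independent upload and experiment codes for one reactor."""
    return {
        "upload_code": secrets.token_hex(4),      # needed to upload data
        "experiment_code": secrets.token_hex(4),  # needed to register a new experiment
    }


def code_matches(submitted: str, stored: str) -> bool:
    """Compare a submitted code with the stored one in constant time."""
    cleaned = submitted.strip().lower()
    return hmac.compare_digest(cleaned.encode(), stored.encode())


codes = new_reactor_codes()
print(code_matches(codes["upload_code"], codes["upload_code"]))      # True
print(code_matches(codes["experiment_code"], codes["upload_code"]))  # almost certainly False
```

Keeping the two codes separate means that a code shared with a whole class for uploading photos does not also grant the ability to start new experiments on the reactor.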
Image analysis pipeline
The image analysis pipeline has been developed in Python using the OpenCV computer vision library [1] and consists of calibration window detection and image normalization.
Detection of the calibration window
Images are converted to greyscale and have a Gaussian blur applied to them before they are subjected to a threshold. The threshold converts each image into a black and white image, to which a contour-finding algorithm is applied. Contours are arranged into a tree-like structure, with a contour's parent being the contour which contains it. To find the three anchor points of the calibration window, the algorithm iterates over the list of contours and finds contours whose areas are in the correct ratios. The ratios which the algorithm looks for are a 9:16 area ratio between a contour and its parent and a 9:25 area ratio between a contour and its grandparent. Once three anchor points are found, the image is transformed to orient the picture correctly and to correct any skewing of the image.
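The ratio test on the contour tree can be sketched in plain Python. This is an illustration rather than the pipeline's own code: the function name, the list-based inputs and the 10% tolerance are assumptions. With OpenCV, the areas would typically come from `cv2.contourArea` and the parent indices from the hierarchy returned by `cv2.findContours` in `cv2.RETR_TREE` mode.

```python
def find_anchor_candidates(areas, parents, tolerance=0.1):
    """Return indices of contours whose area is about 9/16 of the parent's
    area and about 9/25 of the grandparent's area.

    areas[i]   -- area of contour i
    parents[i] -- index of the contour enclosing contour i, or -1 if none
    """
    def close(ratio, target):
        return abs(ratio - target) <= tolerance * target

    candidates = []
    for i, area in enumerate(areas):
        parent = parents[i]
        if parent < 0:
            continue  # top-level contour: no parent to compare with
        grandparent = parents[parent]
        if grandparent < 0:
            continue  # no grandparent, so it cannot be an anchor centre
        if areas[parent] <= 0 or areas[grandparent] <= 0:
            continue  # skip degenerate contours
        if (close(area / areas[parent], 9 / 16)
                and close(area / areas[grandparent], 9 / 25)):
            candidates.append(i)
    return candidates


# Toy hierarchy: contour 0 is the page, contours 1-3 are nested squares with
# areas in the ratio 25:16:9, and contour 4 is an unrelated blob on the page.
areas = [10000.0, 2500.0, 1600.0, 900.0, 300.0]
parents = [-1, 0, 1, 2, 0]
print(find_anchor_candidates(areas, parents))  # [3]
```

Using ratios rather than absolute areas makes the test independent of how far the camera is from the reactor, although a tolerance is still needed because blur and perspective distort the measured areas.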
Image normalization
Using the positions of the anchor points, the coloured squares and the transparent window are located in the image and their pixel information is extracted. The pixel information consists of red, green and blue channel values, which take integer values between 0 and 255 inclusive, as well as positional information. To allow the algorithm to run in reasonable time, the pixels are downsampled.
To normalize the colours, a Gaussian process model is applied to each colour channel (red, green and blue) separately. Gaussian processes are a probabilistic framework used to model unknown functions, and are used in this study because of the unknown non-linearities involved when normalizing photos taken using different equipment and lighting conditions. They are implemented in the algorithm using the GPy Python library [2]. To parameterize the Gaussian process models, pixel information from the coloured squares and from the black squares within the anchor points is used. Positional and RGB values of the pixels are combined so that variation due to the camera and lighting can be taken into account. The Gaussian process models are constructed using a linear kernel combined with a squared exponential kernel. This allows the model to capture the general linear trend while still accommodating non-linearities. The pixel information from an image, together with that from a reference image of the calibration window, is used to train the models for each colour channel.
Once parameterized, the Gaussian process models are used to normalize the pixel information from the transparent section of the calibration window, which corresponds to the colour of the algae in the bioreactor. Each pixel is normalized by inputting its position in the image and its unnormalized RGB values into each Gaussian process model. The outputs of the Gaussian process models are mean and variance values for each colour channel.
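To make the structure of such a model concrete, the sketch below implements Gaussian process regression with a linear plus squared exponential kernel in plain NumPy. It is not the project's GPy code: the hyperparameters are fixed by hand rather than optimised, the data are synthetic, inputs are assumed to be scaled to the range 0 to 1, and all names are illustrative. In GPy the equivalent kernel would presumably be built by adding `GPy.kern.Linear` and `GPy.kern.RBF`.

```python
import numpy as np


def kernel(a, b, lin_var=1.0, rbf_var=1.0, lengthscale=1.0):
    """Linear kernel plus squared exponential kernel between rows of a and b."""
    linear = lin_var * (a @ b.T)
    sq_dist = (np.sum(a**2, axis=1)[:, None]
               + np.sum(b**2, axis=1)[None, :]
               - 2.0 * (a @ b.T))
    rbf = rbf_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)
    return linear + rbf


def gp_predict(x_train, y_train, x_new, noise_var=1e-3):
    """Posterior mean and variance of a zero-mean GP at the rows of x_new."""
    k_tt = kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_nt = kernel(x_new, x_train)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_nt @ alpha
    v = np.linalg.solve(chol, k_nt.T)
    var = np.diag(kernel(x_new, x_new)) - np.sum(v**2, axis=0) + noise_var
    return mean, var


# Synthetic stand-in for calibration-square pixels, all scaled to [0, 1].
rng = np.random.default_rng(0)
position = rng.uniform(0, 1, size=(80, 2))
reference_rgb = rng.uniform(0, 1, size=(80, 3))
# Pretend the camera dims the colours and adds a lighting gradient along x.
observed_rgb = 0.8 * reference_rgb + 0.05 + 0.1 * position[:, :1]
inputs = np.hstack([position, observed_rgb])  # columns: x, y, R, G, B

# Train the red-channel model on 60 pixels, then normalise the other 20.
mean, var = gp_predict(inputs[:60], reference_rgb[:60, 0], inputs[60:])
print("largest error on held-out pixels:", np.abs(mean - reference_rgb[60:, 0]).max())
```

One model of this form would be trained per colour channel, each taking the pixel position and all three observed channel values as input and returning a mean and a variance for the normalized value.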
Challenges and Outlook
Website plotting features using D3
To facilitate the use of the website to track algal growth over time, options to graph the data will be implemented. These graphing options will make use of D3, a JavaScript visualization library. The advantages of client-side visualization are reduced server-side computation and responsive, dynamic plots. The plot will allow the user to plot different measures of algal density (optical density measurements, dry mass and cell count) against time. The time measurement will be determined in one of three ways, with the order of precedence determined by their differing levels of accuracy. The most accurate measure of the time elapsed since the start of the experiment is the Arduino-based time. If the image analysis pipeline does not detect a time window in the photo, and EXIF data is available, the date and time the photo was taken are read from the photo's metadata. The disadvantage of using the metadata is that the internal date and time set on the phone may be incorrect or set to a different time zone. The final fallback is the time the image was uploaded to the server. The disadvantage of using this time is that users may take photos and upload them at a later time. This may be the case if Internet access is intermittent or is not possible on the device being used, such as when a digital camera is used to capture the images.
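The order of precedence for the time measurement could be expressed as a small function. This is a sketch only: the function name, argument names and the returned source label are assumptions, not part of the website's code.

```python
from datetime import datetime, timedelta
from typing import Optional


def elapsed_time(
    experiment_start: datetime,
    upload_time: datetime,
    arduino_elapsed: Optional[timedelta] = None,
    exif_time: Optional[datetime] = None,
) -> tuple[timedelta, str]:
    """Pick the most trustworthy elapsed time and report where it came from."""
    if arduino_elapsed is not None:
        return arduino_elapsed, "arduino"            # read from the LCD in the photo
    if exif_time is not None:
        return exif_time - experiment_start, "exif"  # camera clock may be wrong
    return upload_time - experiment_start, "upload"  # photo may be older than upload


start = datetime(2015, 5, 1, 9, 0)
uploaded = datetime(2015, 5, 3, 18, 0)
taken = datetime(2015, 5, 3, 12, 0)
print(elapsed_time(start, uploaded, exif_time=taken))
```

Returning the source alongside the value would let the plot mark less reliable points differently, for example those timed only by upload.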
Pixel sampling in the normalization algorithm
The Gaussian process image normalization process requires many matrix computations to be carried out. The size of the matrices used in the computations is directly related to both the number of colour square pixels used to train the Gaussian process and the number of pixels sampled from the algal window. Although a more accurate estimate of the RGB colour value of the algal window would be obtained by using more pixels to train the GP model and by sampling more pixels from the algal window, this has to be balanced against the computational run time of the image analysis pipeline. The run time of the image analysis pipeline is pivotal to the final user experience of the website. If the run time is excessive when the user uploads an image, the website may feel unresponsive. Conversely, if the image analysis pipeline were run in a batch manner, the data points from uploaded images would only become visible on the growth graph once the pipeline had been run. This would reduce the gratification the user receives from contributing to the project. To counter these problems, a two-pass image analysis pipeline could be implemented. The first pass of the pipeline would be executed when the image is uploaded, and the number of pixels sampled during this pass would be low to reduce the run time. The pipeline could then execute the second pass during times when the website is less active. The second pass would consist of the image analysis algorithm being run using more pixels in the training and prediction steps than the first pass. Because a user does not wait for the response of the second pass, its run time does not affect the user experience of the website. Reassuringly, the final results obtained using 100 sampled pixels do not differ markedly from those obtained using 500 sampled pixels (Figures 2 and 3).
Despite this observation, the use of more sampled pixels does allow the variation present in each image to be better estimated and reduces the interquartile range observed between repeated measurements post-normalization (Figure 3). The improvement in the estimations validates the use of a two pass image analysis approach in the future.
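The pixel sampling step of a two-pass approach could look like the sketch below. The sample sizes of 100 and 500 are the values compared above; the function name, the array layout and the use of NumPy's random generator are assumptions made for illustration.

```python
import numpy as np

FIRST_PASS_PIXELS = 100   # quick estimate while the user waits
SECOND_PASS_PIXELS = 500  # refined estimate run when the site is quiet


def sample_pixels(pixels: np.ndarray, n_samples: int, seed: int = 0) -> np.ndarray:
    """Return up to n_samples rows of an (N, 5) array of x, y, R, G, B values."""
    rng = np.random.default_rng(seed)
    n_samples = min(n_samples, len(pixels))
    idx = rng.choice(len(pixels), size=n_samples, replace=False)
    return pixels[idx]


# Hypothetical algal window containing 2000 pixels.
window = np.random.default_rng(1).uniform(0, 255, size=(2000, 5))
quick = sample_pixels(window, FIRST_PASS_PIXELS)
refined = sample_pixels(window, SECOND_PASS_PIXELS)
print(quick.shape, refined.shape)  # (100, 5) (500, 5)
```

Because exact Gaussian process training generally scales with the cube of the number of training points, increasing the training set fivefold can be expected to cost far more than five times the run time, which is the motivation for deferring the larger pass.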
Prediction of algal density
Currently the image analysis pipeline just checks for the presence or absence of the calibration window when an image is uploaded, and does not estimate algal density measures. During the project, a number of algal growth experiments were carried out to record algal growth using accurate quantification measures. The measures used were cell concentration (g / l), cell density (cells / ml) and optical density measurements carried out at wavelengths of 680nm and 750nm. Also recorded were images of the bioreactor with a calibration strip fitted.
The relationships between the accurate quantification measures and the colour intensities predicted by the normalization algorithm show non-linear trends in all cases. The trends seem to follow inverse relationships, whereby the lower the colour intensity, the higher the measure of algal density. This makes intuitive sense given that, as the algal reaction mixture becomes denser, less of the white background is observed. This trend also suggests that estimating algal density using digital images of the reaction mixture will be most sensitive within a certain range of density values. Currently, the number of data points collected is not sufficient to form training and testing datasets to validate any prediction algorithm. Once more training data has been obtained, however, the image will be normalized, a prediction of the algal density within the bioreactor will be made and the result will be sent to the user within a few seconds. While not yet validated by experimental data, the method of algal density prediction carried out by the image analysis pipeline will consist of a GP regression model. In order to relate the colour of the algal reaction mixture to algal density, the distribution of values for each colour channel needs to be known. For each pixel normalized using the Gaussian process method detailed above, a normalized mean value and a variance value are given as output for each colour channel. Finding the algal reaction mixture colour values for each channel by averaging the normalized pixel mean colour values would not incorporate the variation observed for each pixel. Therefore, to incorporate the variation, the per-pixel probability distributions are combined and points are sampled from the combined probability distribution. An overall mean and variance for the algal reaction mixture for each colour channel are then calculated from the sampled values.
The colour values for the algal reaction mixture are used as inputs to another Gaussian process model to allow for prediction of the algal density measurements (Figure 4).
Each row of the figure corresponds to a different colour channel, while each column corresponds to a different measure of algal density. Each data point represents a single measurement of algal density and a single image. The colour intensity measurement is normalized using the Gaussian process method described in the main text. The error bars on the data points incorporate both the image variability and the uncertainty in the normalized value, as estimated by the Gaussian process based algorithm. The data are aggregated from multiple calibration experiments carried out over a period of four months.
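The step in which per-pixel distributions are combined and sampled, described in the main text, could be sketched as follows for a single colour channel. The text does not specify how the distributions are combined, so this sketch assumes an equal-weight mixture of one Gaussian per pixel; the function name and the number of draws are also assumptions.

```python
import numpy as np


def combine_pixel_distributions(means, variances, n_draws=10_000, seed=0):
    """Sample from an equal-weight mixture of per-pixel Gaussians and
    return the mean and variance of the sampled values."""
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    rng = np.random.default_rng(seed)
    # Choose a pixel for every draw, then draw from that pixel's Gaussian.
    which = rng.integers(0, len(means), size=n_draws)
    draws = rng.normal(means[which], np.sqrt(variances[which]))
    return draws.mean(), draws.var()


# Three hypothetical normalized pixels from one colour channel.
channel_mean, channel_var = combine_pixel_distributions(
    means=[100.0, 110.0, 120.0], variances=[4.0, 4.0, 4.0]
)
print(channel_mean, channel_var)
```

Under this assumption the overall variance is the average per-pixel variance plus the spread of the per-pixel means, so it reflects both the model's uncertainty about each pixel and the variation across the window, which a plain average of the pixel means would discard.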
3. Logo and graphical appearance
One of our aims was to involve designers and other visual artists to enable us to improve our communications. We have been working with Laura Gordon, a graphic designer, to develop our visual appearance on educational materials and branding (Fig 7).
The Big Open Algae Experiment events
The photobioreactor (PBR) and the Alg-app have already been presented and tested during two weekend labwork sessions, as part of “Co-Lab”, an interactive workshop for designers and scientists held in conjunction with the Institute of Making at UCL. We introduced the concept behind the bioreactor, and then invited participants to take OD readings and use the Alg-app. This served as an informal opportunity to get feedback from an audience that was new to algae and had no previous experience in the lab. Data gathered over the two weekends was used to test the upload procedure and to help validate the algorithm.
We are now planning to test our PBR and the Alg-app with Widening Participation and to make them available as educational tools through the Engineering Education website. Our timetable for testing the PBR and the Alg-app as educational tools in schools runs from May to July.
Widening Participation
We have been in contact with Widening Participation (WP) at UCL, about how we can use the Big Algae Open Experiment in collaboration with their efforts to encourage underrepresented groups to consider higher education. From this meeting we had two suggestions:
Include a page on the website where students can find out more about our disciplines and what we study
Use consent forms that enable WP departments in our respective universities to track the outcomes of high school students who have participated in the project. This could be useful for us long term if we run this over a number of years as we can see if there is an impact e.g. more students choosing biology at A-level / going on to pick degree courses related to synbio, plant sciences or biochemical engineering.
Website development to support educational offering
Through contact with the team at Engineering Education, we received advice based on their experience with the “BristleBot” project, in which they supplied a kit to schools and a PDRA visited the class to help with set-up. They stressed the importance of reliability and ease of set-up, as teachers often have limited time to spend on troubleshooting before lessons. Two suggestions were made to minimise problems on the user side:
Incorporation of an FAQ section for participants, listing solutions to common issues or queries regarding the set up or uploading of data
Inclusion of a “forum” whereby users can communicate with the organising team and get support or report observations.
References
[1] Bradski, G., OpenCV Library, Dr. Dobb's Journal of Software Tools, 2000.
[2] The GPy authors, GPy: A Gaussian process framework in Python, 2012-2015.