Big Data Storage and Access Issues for Phenotyping of Agricultural Data
Plant phenotyping involves the assessment of plant traits such as growth, tolerance, resistance, and yield. The Texas Tech Phenotyping Project is studying the cross-breeding of cotton plants to produce varieties that better survive the harsh climate of West Texas. To support the study, robots collect images of individual plants in a field, and these images are analyzed over time, generating massive amounts of plant data. This research project investigates the big data storage and organizational issues that arise for phenotyping data. A conceptual design of the phenotyping data requirements has been developed to illustrate the large scope of the data involved. NoSQL database technology has been investigated as an alternative to relational databases, with the aim of more efficient storage and retrieval. In particular, the NoSQL-based Couchbase system has been investigated for its high scalability and cost-effective storage of massive data. Because phenotyping data collection and analysis are time-oriented, temporal data management in NoSQL databases has also been explored. This research provides a prototype implementation of image data storage using Couchbase, together with examples of temporal queries and a performance analysis.
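As a minimal illustration of the time-oriented, document-style storage described above, the following sketch models one image record as a JSON document and runs a simple time-range query. It is not the project's implementation: the field names (`plant_id`, `captured_at`, `image_uri`, `height_cm`), the key layout, and the helper functions are hypothetical, and a plain Python dictionary stands in for the key-value document store so that the example runs without a database.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime


@dataclass(frozen=True)
class PlantImageDoc:
    """Hypothetical metadata document for one plant image."""
    plant_id: str
    captured_at: str   # ISO 8601 timestamp, assumed to be UTC
    image_uri: str     # assumed location of the image file itself
    height_cm: float   # example trait measured from the image


def doc_key(doc: PlantImageDoc) -> str:
    # Assumed key layout: one document per plant per capture time.
    return f"image::{doc.plant_id}::{doc.captured_at}"


def put(store: dict, doc: PlantImageDoc) -> None:
    # Store the document as a JSON string under its key.
    store[doc_key(doc)] = json.dumps(asdict(doc))


def images_between(store: dict, plant_id: str, start: str, end: str) -> list:
    """Return documents for plant_id with start <= captured_at < end,
    ordered by capture time. This scans every document; a real store
    would normally use an index on the timestamp instead."""
    lo = datetime.fromisoformat(start)
    hi = datetime.fromisoformat(end)
    hits = []
    for raw in store.values():
        d = json.loads(raw)
        t = datetime.fromisoformat(d["captured_at"])
        if d["plant_id"] == plant_id and lo <= t < hi:
            hits.append(d)
    return sorted(hits, key=lambda d: datetime.fromisoformat(d["captured_at"]))
```

A temporal query then reduces to a call such as `images_between(store, "plot7-plant3", "2016-06-01T00:00:00", "2016-07-01T00:00:00")`, which returns that plant's records for June in capture order. In a document database the same filter would typically be expressed in the system's query language against an indexed timestamp field.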