We are glad to announce the completion of the Master’s thesis by Santeri Levanto. Santeri studied the applicability of the machine learning algorithm “Random Forest” to viscose quality characterisation, and the results are promising. The study opens opportunities for us at GloCell to broaden our scope and to take on new challenges in the field of machine learning. The thesis is available for everyone to read in Aalto University’s online archives. Below is a short abstract of Santeri’s work.
Modelling the quality characteristics of viscose production has not been successful with traditional regression methods such as linear regression. Thanks to advancements in artificial intelligence and machine learning, a non-linear approach using algorithms from these fields is now possible. No reports of using machine learning for viscose quality characterisation were found, which led to interest in studying its applicability. In the early stages of the study, the candidate algorithms were narrowed down to Random Forest, which showed the most potential for this specific case.
The data used in the study consisted of Treiber’s test results spanning more than ten years and came from multiple manufacturers. The data as a whole could not be considered consistent, but manufacturer-specific models produced promising results. The Random Forest algorithm has clear potential to model the quality behaviour, but data quality and quantity are very important factors.
The Random Forest model constructed in the study can predict with 95% confidence whether the viscose quality classifies as good or bad, but the numerical prediction of the quality parameter has a large error margin at the same 95% confidence level. The error margin could likely be reduced if the data quality were consistent and the number of data points larger: this study used roughly 400 data points, whereas the suggested number of data points for Random Forest is above a thousand.
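To make the distinction between the good/bad classification and the numerical prediction more concrete, here is a minimal sketch of how a Random Forest can give both a point prediction and a rough error margin. It is not the model from the thesis: the data is synthetic, the five input variables, the quality formula and the `THRESHOLD` cut-off are all hypothetical, and scikit-learn and NumPy are assumed to be available. The spread of the individual trees is used here as an approximate 95% band, which is a common heuristic rather than a rigorous prediction interval.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for the real data: 400 samples, 5 process variables.
X = rng.normal(size=(400, 5))
# Hypothetical non-linear quality parameter with measurement noise.
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.3, size=400)

# Hold out the last 100 samples as "new" measurements.
X_train, y_train, X_new = X[:300], y[:300], X[300:]

forest = RandomForestRegressor(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)

# One prediction per tree for every new sample: shape (n_trees, n_samples).
per_tree = np.stack([tree.predict(X_new) for tree in forest.estimators_])

# The forest's point prediction is the mean over its trees.
point = per_tree.mean(axis=0)

# Approximate 95% band from the spread of the individual trees.
lower, upper = np.percentile(per_tree, [2.5, 97.5], axis=0)
margin = upper - lower

# Hypothetical good/bad cut-off on the quality parameter.
THRESHOLD = 0.5
is_good = point >= THRESHOLD

print("mean width of the 95% band:", margin.mean())
print("share predicted as good:", is_good.mean())
```

A sample whose whole band lies on one side of the cut-off can be classified with some confidence even when the band itself is wide, which is one way to read the result above: a reliable good/bad call alongside a large numerical error margin.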
The study can be considered a proof of concept for the applicability of Random Forest. Future work involves using more and better data and analysing how this affects model accuracy. Whether the viscose is good or bad is of no interest at industrial scale, where the product is by default always good. A model that could predict small nuances would be a game changer. And it can be achieved.
Upon graduation, Santeri’s employment was also changed from part-time to full-time. Welcome to the GloCell team, Santeri!