Editorial Feature

Developing an AI Algorithm to Monitor Infant Sleep Positions

By Megan Craig, M.Sc. Dec 16 2021

Internet of Things (IoT) and Artificial Intelligence (AI) technologies continue to pave the way for innovation in a myriad of industries. Their integration into smart buildings has received increased interest and support over the years, especially as the demand for solutions to mitigate rising energy costs continues.

Image Credit: Feng Yu/Shutterstock.com

AI algorithms have also been installed to analyze the visual contents of surveillance cameras to determine the number of residents and their positions in smart buildings.

Sudden Infant Death Syndrome (SIDS) is one of the top causes of the sudden and unexpected death of babies under the age of one year in the United States.

Many studies show that allowing babies to sleep on their stomachs can increase SIDS risk. This position is considered the most dangerous as it compresses the baby's chin, narrows the airway and inhibits breathing.

In practice, it is difficult to keep babies sleeping on their backs because it is so easy for them to roll over and sleep on their stomachs. A series of sensor-based wearable or touchable monitors have been developed to monitor sleeping status.

When placed on babies, these sensors track parameters such as body temperature, breathing, and heart rate during sleep. If the monitor's sensors detect abnormal activity, a warning alert is sent.

To address the inadequacies of wearable sensor-based baby monitors, researchers have begun looking into contactless camera-based monitoring, which identifies sleeping positions using cameras and AI algorithms.

Instead of employing wearable body sensors or electrodes to collect signals, AI algorithms evaluate camera output and categorize sleeping postures. When compared to sensor-based baby monitors, this strategy is more user-friendly and cost-effective.

Presently, two key obstacles prevent AI from recognizing possible infant sleep risks in smart buildings: (1) existing datasets of baby sleep postures are neither large nor diverse, and (2) storage constraints are a major barrier to running deep learning AI algorithms on edge computing systems (see Table 1).

Table 1. Summary of Existing AI Algorithms for Contactless Camera-Based Sleep Posture Detection. Source: Huang, et al., 2021

Existing Work	Year	AI Algorithm	Accuracy	Disadvantages
[16]	2016	CNN with 3 convolution and 2 dense layers	94%	1. Expensive due to using a combination of depth sensors and infrared camera; 2. A small dataset (1880 samples) 3. Does not consider minimizing memory footprint
[17]	2020	CNN with 4 convolution and 2 dense layers	88%	1. A small dataset (4250 samples) 2. Diversity issue: all samples are daytime images 3. A large memory footprint (275 MB)
[18]	2021	ResNet with 16 convolution and 3 dense layers	89%	1. A small dataset (4250 samples) 2. Diversity issue: all samples are daytime images 3. A large memory footprint (174 MB)
[19]	2021	DenseNet-121	N/A	1. A small dataset with baby doll images 2. Lack of real baby sleep images 3. A large memory footprint (58.2 MB)
[21]	2021	Inception-V3	90%	1. A small dataset (1200 samples) 2. Lack of real baby sleep images 3. A large memory footprint (175.7 MB)
This work	2021	CNN with post-training weight quantization	90%	1. Post-training weight quantization may cause a slight decrease in accuracy

To address these constraints, a team of researchers has developed an optimized AI algorithm for infant sleep posture classification. Their results were published in the journal AI in December 2021.

Methodology

A Convolutional Neural Network (CNN) was chosen for the current study because it is the most basic neural network design for transforming input photos into output categorization results.

CNNs, which are made up of stacked layers (see Figure 1), have previously been used to detect heavy construction equipment and safety hardhats and to track infant movement.

The input images were infant sleeping photographs with dimensions of 256 × 256 × 3: a width and height of 256 pixels and three RGB color channels.

Figure 1. Proposed AI algorithm for infant sleep posture classification. Image Credit: Huang, et al., 2021

Table 2 provides more detail on the model layer, number of filters, type, number of parameters, and output shape.

Post-training weight quantization was applied to the pre-trained AI algorithm to help the proposed CNN run smoothly on memory-constrained edge systems.

Table 2. Layer-by-layer summary of the proposed CNN. Source: Huang, et al., 2021

Layer Name	Layer Type	Number of Filters	Output Shape	Number of Parameters
Conv2d	Conv2D	16	(256, 256, 16)	448
Max_pooling2d	MaxPooling2D		(128, 128, 16)	0
Conv2d_1	Conv2D	32	(128, 128, 32)	4640
Max_pooling2d_1	MaxPooling2D		(64, 64, 32)	0
Batch_normalization	Batch Normalization		(64, 64, 32)	128
Conv2d_2	Conv2D	64	(64, 64, 64)	18,496
Max_pooling2d_2	MaxPooling2D		(32, 32, 64)	0
Conv2d_3	Conv2D	64	(32, 32, 64)	36,928
Max_pooling2d_3	MaxPooling2D		(16, 16, 64)	0
Batch_normalization_1	Batch Normalization		(16, 16, 64)	256
Flatten	Flatten		16,384	0
Dense	Dense		384	6,291,840
Dropout	Dropout		384	0
Dense_1	Dense		128	49,280
Dense_2	Dense		1	129
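
The parameter counts in Table 2 can be re-derived from the layer shapes. The sketch below is an illustration rather than the authors' code: it assumes 3 × 3 convolution kernels, which the article does not state but which reproduces every Conv2D count in the table, and the helper names are made up for this example.

```python
# Recompute the parameter counts listed in Table 2.
# Assumption: 3x3 convolution kernels (not stated in the article, but
# consistent with every Conv2D row in the table).

def conv2d_params(in_channels, filters, kernel=3):
    # kernel x kernel x in_channels weights plus one bias, per filter
    return (kernel * kernel * in_channels + 1) * filters

def batch_norm_params(channels):
    # scale, offset, moving mean, and moving variance, per channel
    return 4 * channels

def dense_params(inputs, units):
    # one weight per input plus one bias, per unit
    return (inputs + 1) * units

layers = [
    ("Conv2d", conv2d_params(3, 16)),
    ("Conv2d_1", conv2d_params(16, 32)),
    ("Batch normalization (32 channels)", batch_norm_params(32)),
    ("Conv2d_2", conv2d_params(32, 64)),
    ("Conv2d_3", conv2d_params(64, 64)),
    ("Batch normalization (64 channels)", batch_norm_params(64)),
    ("Dense", dense_params(16 * 16 * 64, 384)),
    ("Dense_1", dense_params(384, 128)),
    ("Dense_2", dense_params(128, 1)),
]

total = sum(count for _, count in layers)
for name, count in layers:
    print(f"{name:36s} {count:>10,}")
print(f"{'Total':36s} {total:>10,}")
print(f"Size at one byte per parameter: {total / 1e6:.1f} MB")
```

The first dense layer holds roughly 98% of all parameters. At one byte per parameter the total comes to about 6.4 MB, which is in line with the quantized memory footprint reported in Table 4.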

Weight quantization has two advantages: it reduces the memory footprint needed to store parameters, and it accelerates computation so that AI algorithms run smoothly and quickly on edge systems.

Figure 2a depicts a typical convolution process that does not involve weight quantization; therefore, the weight is set to 32-bit floating-point data by default.

As shown in Figure 2b, each weight of the pre-trained AI algorithm is changed to an 8-bit integer type rather than a 32-bit floating-point type. As a result, it is expected that weight quantization can significantly lower the memory usage of AI systems.

Figure 2. An example illustrating the post-training weight quantization process of the proposed AI algorithm. Image Credit: Huang, et al., 2021
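
The conversion from 32-bit floating-point weights to 8-bit integers can be sketched in a few lines. This is a minimal illustration, not the TensorFlow routine used in the study: it assumes symmetric quantization with a single scale for the whole weight tensor, and the weights themselves are random made-up values.

```python
import random
from array import array

def quantize_int8(weights):
    """Map float weights to int8 codes plus a single float scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # round to the nearest integer step and clamp to the int8 range used here
    codes = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate float weights from the int8 codes."""
    return [c * scale for c in codes]

random.seed(0)
weights = [random.gauss(0.0, 0.05) for _ in range(10_000)]  # made-up weights

float32_copy = array("f", weights)       # 4 bytes per weight
codes, scale = quantize_int8(weights)    # 1 byte per weight
restored = dequantize(codes, scale)

float_bytes = float32_copy.itemsize * len(float32_copy)
int8_bytes = codes.itemsize * len(codes)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print(f"float32: {float_bytes} bytes, int8: {int8_bytes} bytes")
print(f"largest round-trip error: {max_error:.6f} (half a step is {scale / 2:.6f})")
```

Storing one byte per weight instead of four cuts the weight storage to a quarter, and no restored weight is off by more than half a quantization step.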

Quantization noise arises when a continuous random variable is converted to a discrete one; it reduces weight precision and may lead to a loss in classification accuracy. Fortunately, researchers have found that deep learning AI algorithms are not very sensitive to weight precision.
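
The size of this quantization noise can be bounded with a standard argument. As a sketch, assume a uniform quantizer with step size \Delta (a symbol introduced here, not taken from the paper) and, for the variance, the common modelling assumption that the rounding error is uniformly distributed:

```latex
\hat{w} = \Delta \cdot \operatorname{round}\!\left(\frac{w}{\Delta}\right), \qquad
e = w - \hat{w}, \qquad
|e| \le \frac{\Delta}{2}, \qquad
\operatorname{Var}(e) = \frac{\Delta^{2}}{12}
\quad \text{if } e \sim \mathcal{U}\!\left(-\tfrac{\Delta}{2},\, \tfrac{\Delta}{2}\right).
```

For symmetric 8-bit quantization of weights in [-w_max, w_max], the step is \Delta = w_max / 127, so each weight moves by at most about 0.4% of the largest weight magnitude.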

Results and Discussion

Datasets are essential in deep learning as AI systems rely largely on data.

A sufficient dataset should have at least 10 times the number of trainable parameters in an AI algorithm. To achieve this requirement, three datasets (daytime dataset, night-vision dataset, and mixed dataset) were created (see Table 3).

Table 3. Summary of three datasets generated for AI algorithms in this work. Source: Huang, et al., 2021

Dataset	Subset	Number of Samples	Percentage
Daytime dataset (5120 daytime images)	Training Set	3584	70%
	Validation Set	1024	20%
	Testing Set	512	10%
Night-vision dataset (5120 night-vision images)	Training Set	3584	70%
	Validation Set	1024	20%
	Testing Set	512	10%
Mixed dataset (10,240 daytime and night vision images)	Training Set	7168	70%
	Validation Set	2048	20%
	Testing Set	1024	10%
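
The subset sizes in Table 3 follow directly from a 70/20/10 split. The sketch below reproduces them; the file names are hypothetical, and the shuffling step is an assumption, since the article does not say how samples were assigned to subsets.

```python
import random

def split_dataset(items, percents=(70, 20, 10), seed=0):
    """Shuffle items and split them into training, validation, and testing subsets."""
    assert sum(percents) == 100
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # integer arithmetic avoids floating-point rounding in the subset sizes
    n_train = len(shuffled) * percents[0] // 100
    n_val = len(shuffled) * percents[1] // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

# Hypothetical file names standing in for the real images
daytime = [f"day_{i:04d}.jpg" for i in range(5120)]
night_vision = [f"night_{i:04d}.jpg" for i in range(5120)]
datasets = {
    "daytime": daytime,
    "night-vision": night_vision,
    "mixed": daytime + night_vision,
}

sizes = {}
for name, items in datasets.items():
    train, val, test = split_dataset(items)
    sizes[name] = (len(train), len(val), len(test))
    print(f"{name:12s} train={len(train)} validation={len(val)} test={len(test)}")
```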

Figure 3 illustrates the conversion to night-vision images from daytime images.

Figure 3. An example of converting a daytime image into a night-vision image for infant sleep posture detection. The child's face is hidden for privacy. Image Credit: Huang, et al., 2021

The current research employs TensorFlow and Keras to evaluate the proposed AI algorithm. Figure 4a depicts the conventional AI training and evaluation process, while Figure 4b illustrates the process with post-training weight quantization.

Figure 4. (a) Traditional AI training and evaluation process without weight quantization, and (b) proposed AI training and evaluation process with post-training weight quantization. Image Credit: Huang, et al., 2021

The SGD optimizer trains all of the AI algorithm's weights. The learning rate is a hyper-parameter that governs how large a step the SGD optimizer takes each time it adjusts the weights toward their optimal values.

In the experiments, the initial learning rate and its decay parameter were shown to be unaffected by the choice of dataset.
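
How the initial learning rate and a decay parameter interact can be shown on a toy problem. The sketch below is an assumption-laden illustration: the article does not give the decay schedule or its values, so a time-based schedule lr_t = lr0 / (1 + decay * t) and the numbers 0.1 and 0.01 are stand-ins, and a one-dimensional quadratic replaces the real loss. With an exact gradient this is plain gradient descent; in SGD the gradient would come from a mini-batch.

```python
def sgd_minimize(grad, w0, lr0=0.1, decay=0.01, steps=200):
    """Gradient steps with time-based learning-rate decay: lr_t = lr0 / (1 + decay * t)."""
    w = w0
    history = []
    for t in range(steps):
        lr = lr0 / (1.0 + decay * t)   # the step size shrinks as training proceeds
        w = w - lr * grad(w)           # move against the gradient
        history.append((lr, w))
    return w, history

# Toy loss (w - 3)^2, whose gradient is 2 * (w - 3) and whose minimum is at w = 3
final_w, history = sgd_minimize(lambda w: 2.0 * (w - 3.0), w0=0.0)

print(f"first learning rate: {history[0][0]:.4f}")
print(f"last learning rate:  {history[-1][0]:.4f}")
print(f"final weight:        {final_w:.6f}")
```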

The validation loss has a high initial value and decreases gradually. The experimental results of the loss and classification accuracy curves on both the training and validation sets of the daytime dataset are illustrated in Figure 5.

Figure 5. Simulation results of our AI algorithm on the daytime dataset before weight quantization. Image Credit: Huang, et al., 2021

Figures 6 and 7 illustrate the experimental results of similar loss and accuracy curves for the night-vision and mixed datasets, respectively.

Figure 6. Simulation results of our AI algorithm on the night-vision dataset before weight quantization. Image Credit: Huang, et al., 2021

Figure 7. Simulation results of our AI algorithm on the mixed dataset before weight quantization. Image Credit: Huang, et al., 2021

Weight quantization is then applied to these well-trained AI models. After the training phase is completed, experiments are done on the testing sets to determine the final test accuracy.

Due to memory constraints, these AI models might not fit on edge computing systems such as microcontrollers, small FPGAs, or low-end Raspberry Pi boards.

The memory footprint may be reduced by 88% via weight quantization. Table 4 lists the memory footprint details obtained from existing contactless camera-based AI algorithms.

Table 4. Comparison with existing contactless camera-based AI algorithms for baby sleep posture detection on the daytime dataset. Source: Huang, et al., 2021

Existing Work	Dataset	Weight Quantization	Memory Footprint	Test Accuracy
[17]	4250 daytime images	No	275 MB	88%
[18]	4250 daytime images	No	174 MB	89%
[19]	Baby doll pictures instead of real baby pictures	No	58.2 MB	N/A
[21]	1200 non-baby sleep images	No	175.7 MB	90.2%
This work	Daytime dataset (5120 images)	No	51.3 MB	90.8%
This work	Daytime dataset (5120 images)	Yes	6.4 MB	91.6%

Figures 8 to 11 show the results of evaluating the existing AI algorithms on the same mixed dataset. Table 5 summarizes their performance.

Figure 8. Simulation results of the AI algorithm on the mixed dataset. Image Credit: Huang, et al., 2021

Figure 9. Simulation results of the AI algorithm on the mixed dataset. Image Credit: Shadman, 2021

Figure 10. Simulation results of the AI algorithm on the mixed dataset. Image Credit: Khan, 2021

Figure 11. Simulation results of the AI algorithm on the mixed dataset. Image Credit: Tang, et al., 2021

Table 5. Comparison with existing contactless camera-based AI algorithms for baby sleep posture detection on the mixed dataset. Source: Huang, et al., 2021

Dataset	Algorithm	Weight Quantization	Memory Footprint	Test Accuracy	Comments
Mixed dataset (10,240 images)	[17]	No	275 MB	89.5%	Compared with these existing AI algorithms, this work reduces memory footprint by at least 89%, while maintaining similar classification accuracy.
	[18]	No	174 MB	89.3%
	[19]	No	58.2 MB	91.0%
	[21]	No	175.7 MB	91.1%
	This work	No	51.3 MB	89.9%
	This work	Yes	6.4 MB	89.7%
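
The percentage reductions quoted in the text can be checked against the memory footprints in Tables 4 and 5. The short calculation below uses only the figures from those tables; the variable names are illustrative.

```python
# Memory footprints in MB, as listed in Tables 4 and 5
existing_mb = {"[17]": 275.0, "[18]": 174.0, "[19]": 58.2, "[21]": 175.7}
unquantized_mb = 51.3   # this work, before weight quantization
quantized_mb = 6.4      # this work, after weight quantization

def reduction_percent(before, after):
    """Percentage of memory saved when going from `before` to `after`."""
    return 100.0 * (before - after) / before

self_reduction = reduction_percent(unquantized_mb, quantized_mb)
reductions = {name: reduction_percent(mb, quantized_mb) for name, mb in existing_mb.items()}
smallest_reduction = min(reductions.values())
smallest_fold = min(existing_mb.values()) / quantized_mb

print(f"Quantization alone: {self_reduction:.1f}% smaller")
for name, value in reductions.items():
    print(f"Versus {name}: {value:.1f}% smaller")
print(f"Smallest reduction versus existing work: {smallest_reduction:.1f}%")
print(f"Smallest fold reduction versus existing work: {smallest_fold:.1f}x")
```

The smallest gap is against [19], the most compact of the existing models, which is where the "at least 89%" and "nine-fold" figures come from.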

Figure 12 plots the results presented in Table 5.

Figure 12. Performance comparison of test accuracy vs. memory footprint between this work and the existing state-of-the-art works in the literature. Image Credit: Huang, et al., 2021

Table 6 shows the probability of true negatives (TN), true positives (TP), false positives (FP), and false negatives (FN). The new algorithm has a substantially lower false-negative rate (i.e., 11%) than other algorithms, which have at least a 13% false-negative rate.

Table 6. Confusion matrix comparison with existing contactless camera-based AI algorithms for baby sleep posture detection on the mixed dataset. Source: Huang, et al., 2021

Algorithm	TN	FP	FN	TP
[17]	0.93	0.07	0.13	0.87
[18]	0.93	0.07	0.15	0.85
[19]	0.95	0.05	0.14	0.86
[21]	0.94	0.06	0.14	0.86
This work	0.92	0.08	0.11	0.89

TN and FP are fractions of the actual negatives; FN and TP are fractions of the actual positives.
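
The rates in Table 6 can be turned into per-algorithm summary figures. The sketch below computes the false-negative rate and the balanced accuracy (the mean of the true-negative and true-positive rates) from the table values. Balanced accuracy is a metric chosen for this illustration; it equals the test accuracy in Table 5 only if the two classes are equally represented, which the article does not state.

```python
# Rates from Table 6; each actual class sums to 1
confusion = {
    "[17]": {"TN": 0.93, "FP": 0.07, "FN": 0.13, "TP": 0.87},
    "[18]": {"TN": 0.93, "FP": 0.07, "FN": 0.15, "TP": 0.85},
    "[19]": {"TN": 0.95, "FP": 0.05, "FN": 0.14, "TP": 0.86},
    "[21]": {"TN": 0.94, "FP": 0.06, "FN": 0.14, "TP": 0.86},
    "This work": {"TN": 0.92, "FP": 0.08, "FN": 0.11, "TP": 0.89},
}

def false_negative_rate(m):
    """Share of actual positives that were predicted negative."""
    return m["FN"] / (m["FN"] + m["TP"])

def balanced_accuracy(m):
    """Mean of the true-negative rate and the true-positive rate."""
    tnr = m["TN"] / (m["TN"] + m["FP"])
    tpr = m["TP"] / (m["TP"] + m["FN"])
    return (tnr + tpr) / 2

fnr = {name: false_negative_rate(m) for name, m in confusion.items()}
balanced = {name: balanced_accuracy(m) for name, m in confusion.items()}
lowest_fnr = min(fnr, key=fnr.get)

for name in confusion:
    print(f"{name:10s} FNR={fnr[name]:.2f} balanced accuracy={balanced[name]:.3f}")
print(f"Lowest false-negative rate: {lowest_fnr}")
```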

The proposed AI algorithm may be readily implemented into a baby monitor, which typically enables two-way audio transmission, due to its lower memory requirement and improved detection accuracy.

Parents are not always in the vicinity of their children, particularly at night. In that case, the proposed design automatically monitors infant sleep posture and either delivers warnings or captures photographs of the baby and sends them to the parents' mobile phones.

Conclusion

To help mitigate the risk of SIDS, the current research created a large and diverse dataset for AI training and evaluation that includes 10,240 daytime and night-vision baby sleep photos. A CNN-based AI algorithm is also proposed, and post-training weight quantization is employed to reduce memory use.

Experiments show that the new AI system achieves good classification accuracy while using a modest amount of memory.

Compared with the existing literature, the proposed memory-efficient AI algorithm achieves a comparable test accuracy of roughly 90% while consuming only 6.4 MB of memory, which implies at least a nine-fold memory reduction.

Journal Reference:

Huang, Q., Hsieh, C., Hsieh, J., Liu, C. (2021) Memory-Efficient AI Algorithm for Infant Sleeping Death Syndrome Detection in Smart Buildings. AI, 2(4), p. 705–719. Available online: https://www.mdpi.com/2673-2688/2/4/42/htm

References and Further Reading

Huang, Q (2018) Review: Energy-Efficient Smart Buildings Driven by Emerging Sensing, Communication, and Machine Learning Technologies. Engineering Letters, 26, pp. 320–332.
Huang, Q., et al. (2019) Rapid Internet of Things (IoT) Prototype for Accurate People Counting Towards Energy Efficient Buildings. Journal of Information Technology in Construction, 24, pp. 1–13.
Huang, Q & Hao, K (2020) Development of CNN-based visual recognition air conditioner for smart buildings. Journal of Information Technology in Construction, 25, pp. 361–373. doi.org/10.36680/j.itcon.2020.021.
Huang, Q., et al. (2017) Smart Building Applications and Information System Hardware Co-Design. In: Big Data Analytics for Sensor-Network Collected Intelligence; Elsevier BV: London, UK, pp. 225–240.
Gilbert, R., et al. (2005) Infant sleeping position and the sudden infant death syndrome: Systematic review of observational studies and historical review of recommendations from 1940 to 2002. International Journal of Epidemiology, 34, pp. 874–887. doi.org/10.1093/ije/dyi088.
Alfleesy, O (2016) Right-Side Sleeping Position Prevents Sudden Infant Death Syndrome a Literature Review. Journal of Forensic Science & Criminology, 4, p. 204. doi.org/10.15744/2348-9804.4.204.
Zhu, Z., et al. (2015) Wearable Sensor Systems for Infants. Sensors, 15, pp. 3721–3749. doi.org/10.3390/s150203721.
Bonafide, C., et al. (2018) Accuracy of Pulse Oximetry-Based Home Baby Monitors. The Journal of the American Medical Association, 320, pp. 717–719. doi.org/10.1001/jama.2018.9018.
Hasan, M & Negulescu, I (2020) Wearable Technology for Baby Monitoring: A Review. Journal of Textile Engineering & Fashion Technology, 6, pp. 112–120. doi.org/10.15406/jteft.2020.06.00239.
Boughorbel, S., et al. (2010) Baby-Posture Classification from Pressure-Sensor Data. In: Proceedings of the 2010 20th International Conference on Pattern Recognition, Istanbul, Turkey, 23–26 August, pp. 556–559.
Kim, Y. M., et al. (2018) Classification of Children's Sitting Postures Using Machine Learning Algorithms. Applied Sciences, 8, p. 1280. doi.org/10.3390/app8081280.
Liu, Z., et al. (2019) A Method to Recognize Sleeping Position Using an CNN Model Based on Human Body Pressure Image. In: Proceedings of the 2019 IEEE International Conference on Power, Intelligent Computing and Systems (ICPICS), Shenyang, China, 12–14 July, pp. 219–224.
Malik, A & Ehsan, Z (2020) Media Review: The Owlet Smart Sock—A "must have" for the baby registry? Journal of Clinical Sleep Medicine, 16, pp. 839–840. doi.org/10.5664/jcsm.8400.
Moon, R Y & TFOSID (2016) SIDS and Other Sleep-Related Infant Deaths: Evidence Base for 2016 Updated Recommendations for a Safe Infant Sleeping Environment. Pediatrics, 138, p. e20162940. doi.org/10.1542/peds.2016-2940.
Perez-Pozuelo, I., et al. (2020) The future of sleep health: A data-driven revolution in sleep science and medicine. NPJ Digital Medicine, 3, p. 42. doi.org/10.1038/s41746-020-0244-4.
Grimm, T., et al. (2016) Sleep position classification from a depth camera using Bed Aligned Maps. In: Proceedings of the 2016 23rd International Conference on Pattern Recognition (ICPR), Cancun, Mexico, 4–8 December, pp. 319–324. doi.org/10.1109/ICPR.2016.7899653.
Huang, Q & Hao, K (2020) The Development of Artificial Intelligence (AI) Algorithms to Avoid Potential Baby Sleep Hazards in Smart Buildings. ASCE Construction Research Congress, pp. 278–287.
Shadman, R (2021) The Development of Neural Network Architectures for Image Classification to Prevent Sudden Infant Death in Smart Buildings. Master's Thesis, Southern Illinois University Carbondale, Carbondale, IL, USA.
Khan, T (2021) An Intelligent Baby Monitor with Automatic Sleeping Posture Detection and Notification. AI, 2, pp. 290–306. doi.org/10.3390/ai2020018.
Huang, G., et al. (2017) Densely Connected Convolutional Networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July, pp. 4700–4708.
Tang, K., et al. (2021) CNN-Based Smart Sleep Posture Recognition System. IoT, 2, pp. 119–139. doi.org/10.3390/iot2010007.
Szegedy, C., et al. (2016) Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June, pp. 2818–2826.
Zhou, Z., et al. (2019) Edge Intelligence: Paving the Last Mile of Artificial Intelligence with Edge Computing. Proceedings of the IEEE, 107, pp. 1738–1762. doi.org/10.1109/JPROC.2019.2918951.
Deng, S., et al. (2020) The Confluence of Edge Computing and Artificial Intelligence. IEEE Internet of Things Journal, 7, pp. 7457–7469. doi.org/10.1109/JIOT.2020.2984887.
Gong, Z., et al. (2019) Diversity in Machine Learning. IEEE Access, 7, pp. 64323–64350. doi.org/10.1109/ACCESS.2019.2917620.
Althnian, A., et al. (2021) Impact of Dataset Size on Classification Performance: An Empirical Evaluation in the Medical Domain. Applied Sciences, 11, p. 796.
Gong, Z., et al. (2019) Diversity in Machine Learning. IEEE Access, 7, pp. 64323–64350. doi.org/10.1109/ACCESS.2019.2917620.
Fang, W., et al. (2018) Automated detection of workers and heavy equipment on construction sites: A convolutional neural network approach. Advanced Engineering Informatics, 37, pp. 139–149. doi.org/10.1016/j.aei.2018.05.003.
Wu, J., et al. (2019) Automatic detection of hardhats worn by construction personnel: A deep learning approach and benchmark dataset. Automation in Construction, 106, p. 102894. doi.org/10.1016/j.autcon.2019.102894.
Airaksinen, M., et al. (2020) Automatic Posture and Movement Tracking of Infants with Wearable Movement Sensors. Scientific Reports, 10, pp. 1–13. doi.org/10.1038/s41598-019-56862-5.
Bjorck, J., et al. (2018) Understanding Batch Normalization. In: Proceedings of the International Conference on Neural Information Processing Systems, Montreal, QC, Canada, 3–8 December, pp. 7705–7716.
Yang, G., et al. (2019) A Mean Field Theory of Batch Normalization. In: Proceedings of the International Conference on Learning Representations, New Orleans, LA, USA, 6–9 May, pp. 1–15.
Ioffe, S & Szegedy, C (2015) Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. In: Proceedings of the International Conference on Machine Learning, Lille, France, 6–11 July, pp. 448–456.
Banner, R., et al. (2018) Scalable Methods for 8-bit Training of Neural Networks. In: Proceedings of the International Conference on Neural Information Processing Systems, Montreal, QC, Canada, 3–8 December, pp. 5151–5159.
Wu, S., et al. (2018) Training and Inference with Integers in Deep Neural Networks. In: Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada, 30 April–3 May, pp. 1–14.
Zhu, X., et al. (2016) Do We Need More Training Data? International Journal of Computer Vision, 119, pp. 76–92. doi.org/10.1007/s11263-015-0812-2.
Zheng, J., et al. (2021) Improving the Generalization Ability of Deep Neural Networks for Cross-Domain Visual Recognition. IEEE Transactions on Cognitive and Developmental Systems, 13, pp. 607–620. doi.org/10.1109/TCDS.2020.2965166.
Abadi, M., et al. (2016) TensorFlow: A System for Large-Scale Machine Learning. In: Proceedings of the ACM USENIX Conference on Operating Systems Design and Implementation, Savannah, GA, USA, 2–4 November, pp. 265–283.
Baldi, P., et al. (1995) Gradient descent learning algorithm overview: A general dynamical systems perspective. IEEE Transactions on Neural Networks, 6, pp. 182–195.
LeCun, Y., et al. (1998) Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86, pp. 2278–2324. doi.org/10.1109/5.726791.
Senior, A., et al. (2013) An empirical study of learning rates in deep neural networks for speech recognition. In: Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, Canada, 26–31 May, pp. 6724–6728. doi.org/10.1109/ICASSP.2013.6638963.

Written by

Megan Craig

Megan graduated from The University of Manchester with a B.Sc. in Genetics, and decided to pursue an M.Sc. in Science and Health Communication due to her passion for learning about and sharing scientific innovations. During her time at AZoNetwork, Megan has interviewed key Thought Leaders across several scientific, medical and engineering sectors and attended prominent exhibitions worldwide.

Citation

Craig, Megan. (2022, February 21). Developing an AI Algorithm to Monitor Infant Sleep Positions. AZoRobotics. https://www.azorobotics.com/Article.aspx?ArticleID=441.