African Society for Bioinformatics and Computational Biology

ASBCB Omics Codeathon – April 2023

The omics codeathon is an established event where life scientists work on research projects. It is held twice a year and is led by Olaitan I. Awe, the current training officer for the African Society for Bioinformatics and Computational Biology (ASBCB).

The codeathons have been organized through a collaboration between ASBCB and the National Institutes of Health, Office of Data Science Strategy (NIH ODSS).

The primary aim of the omics codeathon is to use omics sequence data and bioinformatics to advance the understanding of the biology of model organisms and pathogens and, ultimately, to improve human health. Our research projects typically use human, cellular, cancer and pathogen genomics to investigate the molecular mechanisms of Mendelian disorders and complex traits. Using different omics approaches, we aim to understand how diseases work at the molecular level.

Codeathon participants and applicants come from across the globe, including South Africa, Kenya, Nigeria, Tunisia, Algeria, Zimbabwe, Morocco, Egypt, Senegal, Mali, Ghana, Brazil, Uganda, Poland, Tanzania, Mozambique, Bangladesh, Botswana, Burkina Faso, Gambia, Benin Republic, Ethiopia, South Korea, Saudi Arabia, Sudan, Malawi, Zambia, India, Taiwan, China, Sweden, Finland, Germany, France, Ireland, Spain, Australia, Mexico, United States, United Kingdom and Canada.

The April 2023 codeathon was virtual. The projects fell into the following categories: Bulk Transcriptomics, Pathogen Genomics, Metagenomics, Human Genomic Variation, Pipeline Development, Biomarker Discovery, Cheminformatics, Clinical Applications, Drug and Vaccine Design, Antimicrobial Resistance, Population Genomics, Genome Wide Association Studies, Polygenic Risk Scores, Mendelian Randomisation, Oncology, Structural Bioinformatics, Software Development, Epigenomics and Machine Learning.

Projects

Title
Team
Project Description

Bacterial Diversity in the CoVID-19 Metagenome: An AI-aided Approach

Nouhaila En najih, Bonface Onyango, Edward Tettevi, Eddie Lulamba, Sibongiseni Msipa, Fatima Zahra Annassiri, Fatma Omar and Olaitan I. Awe

In this project, we proposed a novel pipeline that integrates three machine learning models to identify COVID-19 patients from human gut metagenome data. We performed a comprehensive analysis of the gut microbiota profiles of COVID-19 patients and healthy controls to identify potential biomarkers associated with the disease. Our research contributes to the ongoing effort to improve COVID-19 diagnosis and treatment. Insights gained from the study could help clarify the interplay between gut microbiota and viral infection and pave the way for future research in this field. Our aim is to highlight the potential of machine learning-based approaches to improve the understanding of complex communicable diseases such as COVID-19 and to provide a roadmap for the development of personalized medicine strategies.

Exploring the Causal Effect of Omega 3 Polyunsaturated Fatty Acid Levels on the Risk of Type 1 Diabetes: A Mendelian Randomization Study

Lydia Abolo, Joachim Ssenkaali, Onan Mulumba and Olaitan I. Awe

The impact of Type 1 diabetes (T1D) is immense, with nearly 8.4 million individuals worldwide affected by the condition as of 2021. This number is expected to rise over the next few decades. A diagnosis of T1D carries a lifelong risk of poor blood glucose control, which can result in various debilitating complications. Although insulin therapy has been used for nearly a century to manage hyperglycemia in T1D, there are currently no treatments that address its underlying causes. While observational studies have suggested that sufficient dietary intake of omega-3 fatty acids (ω-3) may be associated with a reduced risk of developing T1D, the evidence for this benefit remains inconclusive. In this study, we used a Mendelian randomization (MR) approach to overcome some of the limitations of observational studies and further investigate the association between ω-3 and T1D. We used publicly available data from a genome-wide association study (GWAS) of ω-3 polyunsaturated fatty acids in a sample of 114,999 European participants, and a meta-analysis of 12 GWAS on T1D, which included a total of 9,358 cases and 15,482 controls of European ancestry. Single nucleotide polymorphisms (SNPs) associated with ω-3 were identified and used as proxies to investigate the potential causal relationship between ω-3 and T1D.
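As an illustration of how SNPs can serve as proxies in a two-sample MR analysis, the sketch below computes an inverse-variance weighted (IVW) estimate from per-SNP summary statistics. The abstract does not say which MR estimator the team used, so IVW is an assumption here, and the function name and all numbers are hypothetical.

```python
import math


def ivw_estimate(snps):
    """Inverse-variance weighted causal estimate (first-order weights).

    snps: iterable of (beta_exposure, beta_outcome, se_outcome) tuples,
    one per genetic instrument. Returns (estimate, standard_error).
    """
    numerator = sum(bx * by / se ** 2 for bx, by, se in snps)
    denominator = sum(bx ** 2 / se ** 2 for bx, by, se in snps)
    estimate = numerator / denominator
    standard_error = math.sqrt(1.0 / denominator)
    return estimate, standard_error


# Hypothetical summary statistics, NOT taken from the study:
# (SNP effect on omega-3 level, SNP effect on T1D log-odds, SE of the T1D effect)
example_snps = [
    (0.10, -0.020, 0.010),
    (0.20, -0.050, 0.015),
    (0.15, -0.030, 0.012),
]

beta, se = ivw_estimate(example_snps)
odds_ratio = math.exp(beta)  # odds ratio per unit increase in the exposure
print(f"IVW beta = {beta:.3f} (SE {se:.3f}), OR = {odds_ratio:.3f}")
```

In practice, an analysis like this would also include sensitivity checks for pleiotropy; those are outside the scope of this sketch.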

Integrated Analysis of Genetic Variation, Gene Expression and Methylation Changes in Epilepsy

Marion N. Nyamari, Modibo K. Goita, Shamim Osata and Olaitan I. Awe

Epilepsy is a complex neurological disorder that can be caused by both genetic and environmental factors. Genetic changes include copy number variants, point mutations, and chromosomal abnormalities that affect ion channels, neuronal signaling, and brain development. Epigenetic changes include modifications to DNA that can alter gene expression without changing the underlying DNA sequence, and may be involved in the regulation of genes important for neuronal function. Different genes and epigenetic modifications have been associated with different types of epilepsy, and identifying these changes can provide a better understanding of the condition.
In this study, we identified differentially expressed genes, differentially methylated regions and SNPs in patients with epilepsy using RNA-seq and epigenome analysis. The results provide valuable insights into the molecular mechanisms underlying epilepsy, which could lead to the development of more effective treatments.

Designing a Multi-epitope Vaccine Targeting Enterotoxin B in Staphylococcus aureus

Vanessa Natasha Onyonyi, Jimmy Nkaiwuatei and Sisay Teka Degechisa

The emergence of antimicrobial resistance (AMR) among Staphylococcus aureus (S. aureus) strains has worsened the burden of infections caused by this pathogen. Despite various research efforts, there is still no effective vaccine against S. aureus infections.
In this study, we constructed a multi-epitope vaccine as a potential vaccine candidate for S. aureus. The target protein was Staphylococcal Enterotoxin B (SEB), an enterotoxin associated with food poisoning. A variety of computational methods were used to locate epitopes that may be exploited in vaccine construction. The resulting vaccine construct was predicted to be a non-allergenic, non-toxic antigen with a higher antigenicity than the native SEB protein. The final construct had 401 amino acids, an estimated half-life of more than 10 hours in E. coli (in vivo), and an instability index of 31.45, which is below the threshold of 40 and classifies it as stable.

Investigating the Genetic Association between Type 2 Diabetes and Dementia

Nour ElHouda Barhoumi, Yusuf Danasabe Jobbi, Enas A. Fouad-ElHady and Olaitan I. Awe

In this study, we used microarray data from the NCBI Gene Expression Omnibus (GEO) database to gain molecular and clinical insight into the genetic association between type 2 diabetes (T2D) and dementia. We used bioinformatics tools to study a wide range of molecular pathways, gene ontology (GO) terms, cellular components/localization, and drug target information needed to understand the association between these two diseases and their complexities.
We used Enrichr for gene expression analysis and to determine the differentially expressed genes (DEGs) and protein-protein interactions (PPI) in brain astrocytes, endothelial cells, and neurons of diabetic patients as cases (D) and matched controls (C). Some of our findings support the hypothesis of an association between T2D and dementia. This work may help clarify the link between these two diseases. It also suggests candidate drug targets for therapeutic intervention.

In-silico Identification of Potential Inhibitors of Plasmepsins and Falcipains in Malaria

Henry Ndugwa, Mamadou Sangare, Florence Mbaoji, Lassana Coulibaly, Helga Saizonou, Kehinde Adeniran, Oluwasegun A. Babaleye and Ojochenemi A. Enejoh

Plasmodium falciparum malaria has long been a significant cause of morbidity and mortality in sub-Saharan Africa. Increasing resistance of Plasmodium species to currently available medications has created an urgent need for more effective treatment alternatives. New drugs targeting new sites within the Plasmodium falciparum parasite are needed to counter malaria. One method for treating the disease is to stop parasite proliferation by targeting the proteins necessary for parasite survival. In this study, we sought to identify possible inhibitors of two important classes of proteases that are essential to the survival of these parasites: falcipains and plasmepsins. Using machine learning and in-silico molecular docking techniques, we screened over 400,000 compounds, retrieved from a public database, against Falcipain 2 and 3 and Plasmepsin II and V. We then predicted the ADME (absorption, distribution, metabolism, and elimination) properties and drug-likeness of the compounds.
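A drug-likeness check of the kind mentioned above can be sketched with Lipinski's rule of five, a widely used heuristic. The abstract does not state which drug-likeness criteria the team applied, so this rule, the `Compound` class, and the descriptor values are assumptions for illustration; in a real workflow the descriptors would come from a cheminformatics toolkit.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    """Precomputed molecular descriptors for one compound (hypothetical input)."""
    name: str
    mol_weight: float        # molecular weight in daltons
    logp: float              # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(c: Compound) -> int:
    """Count how many of Lipinski's four rules the compound breaks."""
    return sum([
        c.mol_weight > 500,
        c.logp > 5,
        c.h_bond_donors > 5,
        c.h_bond_acceptors > 10,
    ])


def is_drug_like(c: Compound, max_violations: int = 1) -> bool:
    """Treat a compound as drug-like if it breaks at most `max_violations` rules."""
    return lipinski_violations(c) <= max_violations


# Hypothetical compounds with made-up descriptor values
library = [
    Compound("cmpd_1", 342.4, 2.1, 2, 5),
    Compound("cmpd_2", 712.9, 6.3, 4, 9),
]
hits = [c.name for c in library if is_drug_like(c)]
print(hits)
```

Allowing one violation follows the common reading of the rule; a stricter screen could set `max_violations=0`.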

NeuroVar: A Tool for Visualizing Gene Expression and Genetic Variation in Biomarkers of Neurological Diseases

Hiba Ben Aribi, Najla Abassi and Olaitan I. Awe

NeuroVar is a novel tool for visualizing genetic variation (SNPs and indels) and gene expression data that focuses on biomarkers of neurological diseases.
In this study, we developed NeuroVar as an R Shiny app and as standalone software to visualize gene expression and genetic variant calling data from VCF/CSV files.
The tool is available as a desktop application with a user-friendly graphical user interface (GUI) and does not require any computational skills to use. Beyond the Shiny app, our goal is to provide a freely accessible, platform-independent tool that can be used on computers running Windows, macOS and Linux operating systems.
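To make the SNP/indel distinction concrete, the sketch below classifies records from VCF-formatted text by comparing the lengths of the REF and ALT alleles. This is not NeuroVar's code (the tool itself is written in R); the function names and sample records are hypothetical, and only the standard tab-separated VCF column layout is assumed.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Classify one REF/ALT pair as 'SNP', 'indel', or 'other'."""
    # Symbolic, missing, or spanning-deletion alleles are not classified here.
    if alt.startswith("<") or alt in {".", "*"}:
        return "other"
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(ref) != len(alt):
        return "indel"
    return "other"  # e.g. multi-nucleotide substitutions of equal length


def summarize_vcf(lines):
    """Count variant types over an iterable of VCF text lines."""
    counts = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/meta lines
        fields = line.rstrip("\n").split("\t")
        ref = fields[3]
        for alt in fields[4].split(","):  # one record may list several ALT alleles
            kind = classify_variant(ref, alt)
            counts[kind] = counts.get(kind, 0) + 1
    return counts


# Hypothetical records in VCF column order: CHROM POS ID REF ALT QUAL FILTER INFO
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\t.\tA\tG\t50\tPASS\t.",
    "1\t2000\t.\tAT\tA\t50\tPASS\t.",
    "2\t3000\t.\tC\tCTT,T\t50\tPASS\t.",
]
print(summarize_vcf(example))
```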

Genetic Variant Profiling of Multiple Sclerosis Patients

Ann Laigong, Ichrak Benamri, Chimenya Ntweya, Olaide Damilola and Olaitan I. Awe

Multiple sclerosis is a chronic inflammatory disorder of the central nervous system, which affects more than 2 million people around the world. It is defined by the autoimmune degradation of myelin and the subsequent loss of neurons and is prevalent in individuals between the ages of 20 and 40 years.
Recent research links multiple sclerosis (MS) to genetic variation at loci encoding components of the epigenetic machinery, but little is known about how genetic and epigenetic factors work together to influence susceptibility to MS at the locus level.
In this study, we analysed whole exome data of multiple sclerosis patients and controls in order to explore the molecular mechanisms underlying causal variants and to identify variants that might escape detection by conventional genetic studies.

A Portable Pipeline for Pneumonia and CoVID-19 Detection from Chest Images Using Machine Learning

Lawrence Muwonge, Brenda Kamau and Damilola Olanipon

The detection of pneumonia and COVID-19 from chest images depends greatly on the expertise of the radiologist and can be prone to bias and inaccurate diagnosis. With the advent of machine learning, there is an opportunity to develop more accurate and efficient methods for detecting pneumonia and COVID-19 from chest images.
In this study, we used publicly available chest images to train a convolutional neural network model. We then deployed the trained model in the pneumosarscov2 mobile application using on-device machine learning. The pneumosarscov2 application can distinguish pneumonia, COVID-19 and normal cases from chest images and can therefore be used by clinicians to support the diagnosis of pneumonia and COVID-19.

Polygenic Risk Score for Cancer Disease in African Populations

Daniel Adediran, Precious Adebola and Abdulrahman Olagunju

In this study, we assessed the representation of African populations in cancer Polygenic Risk Score (PRS) studies and determined the clinical utility of calculated PRS for cancer patients of African ancestry.
We searched PubMed for studies on polygenic risk scores and cancer in African populations. Studies were included if they were written in English and focused on cancer, African ancestry, and PRS.
Broader application of PRS to the study of cancer traits in African populations remains essential. Such research would strengthen clinical utility by providing information to better understand cancer traits, enabling further improvement in clinical outcomes for African populations.
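For readers unfamiliar with how a PRS is calculated, a minimal sketch follows: the score is the sum of an individual's risk-allele dosages weighted by per-SNP effect sizes from a GWAS. The function name, SNP identifiers, weights and dosages are all hypothetical and are not drawn from any of the reviewed studies.

```python
def polygenic_risk_score(effect_sizes, dosages):
    """Weighted sum of risk-allele dosages.

    effect_sizes: dict mapping SNP id -> per-allele effect size (e.g. log odds ratio)
    dosages: dict mapping SNP id -> number of risk alleles carried (0, 1 or 2)
    SNPs missing from the individual's genotype data are skipped.
    """
    return sum(
        beta * dosages[snp]
        for snp, beta in effect_sizes.items()
        if snp in dosages
    )


# Hypothetical weights and genotypes for illustration only
weights = {"snp1": 0.20, "snp2": -0.10, "snp3": 0.05}
individual = {"snp1": 2, "snp2": 1}  # snp3 was not genotyped

score = polygenic_risk_score(weights, individual)
print(f"PRS = {score:.2f}")
```

Because the weights usually come from GWAS in other populations, a score computed this way may transfer poorly across ancestries, which is the representation gap the study examines.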

Comparing Assembly-based and Read-based Methods for Identifying Antimicrobial Resistance Genes in Cattle

Kauthar M. Omar, Stanley Omondi Onyango and Justice T. Ngom

The menace of antimicrobial resistance is on the rise, but there is no clarity on which approach to choose for characterizing resistance from metagenomic samples.
Our study aims to compare read-based and assembly-based approaches for identifying antimicrobial resistance genes in cattle metagenomic samples, in terms of the resistance gene information they yield and their computational resource and time requirements. We used publicly available shotgun metagenomic samples from Tanzanian cattle and plan to expand the study to cattle metagenomic samples from other African countries.
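One simple way to compare the gene-level output of the two approaches is to treat each result as a set of gene names and measure their overlap. The sketch below is an assumption about how such a comparison could be done, not the team's method; the function name and gene labels are placeholders.

```python
def compare_gene_sets(read_based, assembly_based):
    """Summarize agreement between two collections of detected resistance genes."""
    read_set, assembly_set = set(read_based), set(assembly_based)
    shared = read_set & assembly_set
    union = read_set | assembly_set
    return {
        "shared": sorted(shared),
        "read_only": sorted(read_set - assembly_set),
        "assembly_only": sorted(assembly_set - read_set),
        # Jaccard index: 1.0 means identical sets, 0.0 means no overlap
        "jaccard": len(shared) / len(union) if union else 1.0,
    }


# Placeholder gene labels, not results from the study
read_hits = ["geneA", "geneB", "geneC"]
assembly_hits = ["geneB", "geneC", "geneD"]

summary = compare_gene_sets(read_hits, assembly_hits)
print(summary)
```

Runtime and memory use, the other two criteria named above, would need to be recorded separately when each pipeline is run.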