Research projects and collaborations

Understanding Pedagogical Approaches for Large Language Models (LLMs) in Programming Education


Summary: This project explores how LLMs can be used as a programming aid in quantitative courses. The main goals comprise:

  • Identify current approaches regarding the use of closed LLM-based tools, such as ChatGPT, GitHub Copilot, and Bard, in programming tasks, based on literature review and educational case studies.
  • Assess the feasibility and efficacy of incipient open LLMs, such as StarCoder, as candidate tools for the design of teaching materials related to programming education.
  • Test different LLM capabilities, such as a) text to code, b) code to code, and c) code to text.
  • Design teaching resources and encourage discussion around practices involving AI-assisted coding tools.

    Project details:
  • Period: 2024-2025
  • Funding: LSE Eden Centre - Eden Development Fellowships

GENIAL - GENerative AI Tools as a Catalyst for Learning


Summary: A collaborative focus group on generative AI tools and their use for teaching and learning. Joint research and development project with Dr. Jonathan Cardoso-Silva (DSI).

    Project details:
  • Period: 2023-2024
  • Funding: LSE Eden Centre - Scholarship of Teaching and Learning/Catalyst Fund
  • Useful links: GENIAL Website

Decolonising data science teaching and learning


Summary: Is the way data science is taught adequate and inclusive, or is everyone forced to follow the same approach? Are the datasets and case studies used for teaching and evaluation representative of diverse contexts and historical perspectives, or do they present a biased view of such contexts? Do data science assessment activities effectively measure students' critical thinking and innovative skills, or do they simply assess their technical ability to memorize and repeat the same solutions? This education-focussed project concentrates on two aspects of data science in light of decolonisation: teaching and assessment. We will investigate these and other questions, and produce guidelines for incorporating decolonising concepts into teaching and assessment resources related to data science courses.


Evaluating effects of social inequalities on the COVID-19 pandemic in a low- and middle-income country


Summary: This project aims to create a Social Disparities Index (SDI) to measure inequalities relevant to the COVID-19 pandemic, such as unequal access to healthcare and regions more vulnerable to infection. In Brazil, markers of inequality are associated with COVID-19 morbidity and mortality. The SDI will capture these markers from COVID-19 surveillance data, and a public visualisation dashboard will share the index and patterns of COVID-19 incidence and mortality with the broader community. This will enable health managers and policymakers to monitor the pandemic situation in the most vulnerable populations and target social and health interventions.


AI as a service for tackling COVID-19 in Brazil


Summary: This project aims to establish a cloud-based AI platform to support research and inform decisions related to COVID-19 in Brazil. The emphasis will be on the activities conducted by Rede CoVida, a Brazilian network of around 180 academics, policymakers, health workers, and members of the general public established in March 2020 to i) monitor the spread of the disease in Brazil, ii) design multi-purpose, real-time prediction models, and iii) synthesize and disseminate scientific evidence. We will focus on the following research goals: i) design of a large-scale data lake and integration platform; ii) design and validation of mixed AI models for prediction and decision-making support; and iii) design of an interactive bibliometrics platform focusing on synthesis of evidence and correlations within the growing literature related to COVID-19.


Alert-early system of outbreaks with pandemic potential


Summary: This collaboration aims to design a data-driven system for early warning of respiratory viral disease outbreaks, contributing to preparedness against epidemics.


Risk of chronic clinical condition following previous hospitalisations by psychiatric disorder


Summary: This project aims to increase knowledge of the relationship between mental disorders and other chronic conditions in order to improve the lives of those affected. More specifically, we want to i) estimate the risk of hospitalisation or death by diabetes mellitus, cardiovascular diseases or stroke following a hospitalisation due to depressive disorders, alcohol and substance use-related disorders, and schizophrenia; ii) estimate the risk of the occurrence of or death by tuberculosis following a hospitalisation due to depressive disorders, alcohol and substance use-related disorders, and schizophrenia; iii) investigate how these chronic conditions cluster together and how these patterns evolve over time and with ageing.

    Project details:
  • Period: 2020 - 2022
  • Team: CIDACS, UFBA, University College London, London School of Hygiene and Tropical Medicine
  • Funding: Global Multimorbidity Seed Funding (UKRI)
  • Useful links: UKRI Global multimorbidity: seed funding 2019

Scaling up multimodal data fusion and analytical models over multiple-GPU systems


Summary: This project focuses on the exploitation of multi-GPU systems to i) accelerate our probabilistic data fusion tool (AtyImo), more specifically its preprocessing and data linkage methods, and ii) deploy and validate complex machine and deep learning models to analyze huge amounts of data built from Brazilian socioeconomic and public health care databases.

    Project details:
  • Period: 2019 - 2021
  • Team: UFBA, SENAI-CIMATEC
  • Funding: Large-scale Applied Data Science (NVIDIA)

Design and validation of personalised risk prediction models over Brazilian health care data


Summary: This project aims to i) define a set of diseases, at individual and municipality level, for which risk prediction models can effectively contribute to early detection and/or guidance of treatment; ii) establish proof-of-concept studies; iii) identify existing models adjustable to the Brazilian population; iv) perform deep experimentation of the proposed models; and v) generate a set of results to be validated by a panel of epidemiologists and statisticians, as well as governmental staff.

    Project details:
  • Period: 2019 - 2021
  • Team: UFBA, University College London
  • Funding: Newton International Fellowship Follow-on Funding (The Royal Society)

Standardisation of wearable-based algorithms for healthcare applications in developing countries


Summary: This project aims to develop a novel standardised framework to better inform algorithms for a more harmonised gait assessment in Parkinson's disease (PD), particularly for developing countries where guidance is lacking. This project will lead to the design of an online simulation tool to test algorithms. Additionally, it will outline an educational process for all clinicians to better understand the functionality of wearables/algorithms and resulting outcomes. This will better guide PD assessment for sustainable health, promoting and encouraging low-cost wearables as routine diagnostics in developing countries. This framework will also be adapted to the needs of those in developed regions.

    Project details:
  • Period: 2018 - 2019
  • Team: Northumbria University, Instituto de Biociências de Rio Claro, University of Birmingham, University College London, UFBA
  • Funding: Frontiers of Engineering Seed Funding (Royal Academy of Engineering)
  • Useful links: RAEng current and recent awards

IMAPI - early childhood friendly municipal index


Summary: IMAPI was created to describe municipal contexts that are more or less favorable to early childhood development in Brazil and to support decision-making about early childhood. It has 31 indicators related to the provision of public policies, actions, and services, as well as family practices aimed at child development, that reflect the five domains of the Nurturing Care Framework recommended by the World Health Organization, UNICEF and World Bank.

    Project details:
  • Period: 2018 - 2020
  • Team: UnB, UFBA, Yale School of Public Health (USA), São Paulo Health Institute
  • Funding: Grand Challenges Explorations: Data science approaches to improve maternal and child health in Brazil (Gates Foundation), CNPq (Brazilian National Research Council)
  • Useful links: IMAPI Website

Integrating socioeconomic and health data to combat malaria


Summary: This project aims to build a platform that routinely integrates data from malaria surveillance systems with healthcare data (incidence and hospitalization) and socioeconomic data (income and living conditions) captured from Brazilian governmental systems. An interactive visual mining dashboard will provide open access and support for data analysis, including forecast and multilayer visualisation models.

    Project details:
  • Period: 2016 - 2019
  • Team: UFBA, Health Surveillance Foundation (Amazonas), Oswaldo Cruz Foundation (FIOCRUZ)
  • Funding: Design New Analytics Approaches for Malaria Elimination (Round 17) (Gates Foundation)
  • Useful links: Global Grand Challenges, Malaria database and visual analytics tool

Treating heterogeneity and uncertainty in data integration: case study on Brazilian databases


Summary: This project aims to i) design and validate a data integration model and related computing tools addressing heterogeneity, uncertainty and scalability, targeted at big data integration; ii) support some ongoing Brazil-UK projects: the 100 million cohort, the surveillance platform for Zika and microcephaly, and predictive analytics methods applied to malaria data (Post-doctoral research).

    Project details:
  • Period: 2016 - 2018
  • Team: UFBA, University College London
  • Funding: Newton International Fellowships (The Royal Society, UK)
  • Useful links: Denaxas Lab

Design of a scientific repository (data lake) for big data applications


Summary: This project aims to design and deploy a data repository (data lake) for big data applications. The first prototype comprises malaria surveillance data to support predictive analytics.

    Project details:
  • Period: 2016 - 2021
  • Team: UFBA, Health Surveillance Foundation (Amazonas), Oswaldo Cruz Foundation (FIOCRUZ)
  • Funding: Bahia State Research Agency (FAPESB)

BAMBU - metropolitan network for trial and innovation on future internet


Summary: This project aims to develop and implement an experimental metropolitan network for trial and innovation on future internet issues. This network will be based on the existing REMESSA network. Besides serving as an experimental sandbox for educational and research institutions of Bahia, we plan to link BAMBU with other national and international networks through the FIBRE project.

    Project details:
  • Period: 2015 - 2020
  • Team: UFBA, IFBA, Oswaldo Cruz Foundation (FIOCRUZ), RNP, LNCC, UFES, Florida International University, Philips
  • Funding: Bahia State Research Agency (FAPESB)
  • Useful links: BAMBU WebHome

Computational infrastructure to support big data applications in health


Summary: This project aims to design a middleware for probabilistic record linkage of governmental databases: Cadastro Único (socioeconomic data), PBF (payments from Bolsa Família) and SUS (Brazilian National Health System). This middleware will provide data warehouse (ETL) routines for data quality assessment, data cleansing, and anonymization, as well as a Spark-based execution engine to support data linkage from these databases. The generated data marts are used by statisticians and epidemiologists to assess the effect of social programmes on the incidence of some diseases (leprosy, tuberculosis, HIV/AIDS) in the beneficiary population.

    Project details:
  • Period: 2014 - 2016
  • Team: UFBA, Oswaldo Cruz Foundation (FIOCRUZ)
  • Funding: Early Doctor Research Grant (UFBA)

Cloud computing infrastructure to support Bioinformatics and Robotics applications


Summary: This project aims to i) improve our BOINC implementation designed for the GT-MC2 and ii) develop a new implementation to support highly distributed applications based on Hadoop. We evaluated a number of Bioinformatics applications in both implementations (SGA for BOINC and SGA for Hadoop). We are also considering the use of hybrid parallel architectures (multicore + multi-GPU) in order to run these applications efficiently.

    Project details:
  • Period: 2013 - 2015
  • Team: UFBA, UNEB, Polytechnic University of Valencia (UPV)
  • Funding: Scientific Initiation Grant (UFBA)

JiT-Clouds: highly scalable infrastructure-as-a-service


Summary: JiT-Clouds is a research effort carried out by a group of Brazilian Universities and Research Centers, sponsored by the Centro de Pesquisa e Desenvolvimento em Tecnologias Digitais para Informação e Comunicação (CTIC) held by the Ministry of Sciences and Technology. It aims at developing an alternative way to build public cloud infrastructures, based on the concept of Just-in-Time (JiT) deployment of the computing infrastructure.

    Project details:
  • Period: 2011 - 2013
  • Team: UFCG, UFRGS, UFBA + 11 other universities and research laboratories
  • Funding: CTIC (Brazilian Ministry of Sciences and Technology)
  • Useful links: CTIC - JitClouds

GT-MC2: my scientific cloud


Summary: MC2 is a cloud computing platform designed to support e-science applications. It provides access to a large amount of computational resources for brief time intervals, storage, reproducibility of experiments and control of data provenance. This platform uses a PaaS model, allowing for the easy development and deployment of customized services and portals, accessed at the SaaS level. At the IaaS level, MC2 employs a broker to efficiently provide access to high performance clusters, volunteer computing resources (based on BOINC), peer-to-peer computing resources (based on OurGrid) and cloud resources (based on Eucalyptus).

    Project details:
  • Period: 2011 - 2013
  • Team: LNCC, UFCG, UFBA, UFC, UFRGS
  • Funding: CTIC/RNP (Brazilian Ministry of Sciences and Technology)
  • Useful links: RNP

Analysis of performance models applied to high-performance hybrid architectures


Summary: This project aims to study performance models used for high performance processing in hybrid architectures composed of multicore CPUs and manycore GPUs. We want to identify and measure metrics related to performance and processing capacity/elasticity, as well as limitations related to application execution, tools for application development and other aspects related to each architecture. A set of applications belonging to different classes (highly coupled, bag of tasks and data-intensive) will be evaluated in terms of their requirements (resources needed, data movement, etc.), aiming to define a set of operating characteristics for each class. As major outcomes, the project must generate a detailed analysis of the suitability of current performance models applied to hybrid architectures and propose some extensions in order to efficiently support such architectures.

    Project details:
  • Period: 2011 - 2013
  • Team: UFBA, UNEB, UNIVASF
  • Funding: Scientific Initiation Grant (UFBA)

GT-UniT: monitoring the BitTorrent universe


Summary: This project aims to develop a software infrastructure to monitor BitTorrent networks. The specific goals comprise monitoring Portuguese content, the popularity of specific contents and the traffic observed in some sub-networks. Experiments were executed on 6 servers hosted in the Brazilian internet backbone (points of presence) and more than 95 nodes in PlanetLab.

    Project details:
  • Period: 2010 - 2012
  • Team: UFRGS, UFCG
  • Funding: CTIC/RNP (Brazilian Ministry of Sciences and Technology)
  • Useful links: RNP

PMM: modular multimedia platform


Summary: Design of a middleware and applications for a modular multimedia platform, offering services for digital video recording and interaction focused on digital television.

    Project details:
  • Period: 2006 - 2008
  • Team: UFRGS, UNILASALLE, UFSC, Digitel
  • Funding: FINEP (Brazilian Ministry of Sciences and Technology)

MultiCluster: support for parallel programming on multiple clusters


Summary: This project aims to define an integration model for heterogeneous cluster-based architectures composed of Myrinet, SCI, and Fast Ethernet. The main goals are to identify hardware and software requirements and provide a complete programming environment that allows users to configure such an architecture and distribute tasks according to their application's needs. To this end, we integrate different DECK implementations and use JXTA to aggregate resources from heterogeneous clusters (PhD research).

    Project details:
  • Period: 2000 - 2006
  • Team: UFRGS, Universität Paderborn, Laboratoire d'Informatique de Grenoble (LIG/UJF)
  • Funding: CAPES (Brazilian Ministry of Education) - PhD Fellowship
  • Useful links: UFRGS - LUME Repository

DECK: parallel programming applied to cluster computing


Summary: This project focuses on the development of a parallel programming library called DECK (Distributed Execution and Communication Kernel) applied to clusters composed of different communication technologies (Fast Ethernet, Myrinet, and SCI). We developed a DECK version for each communication technology and evaluated its performance against MPI and Athapascan-0 (MSc research).

    Project details:
  • Period: 1998 - 2000
  • Team: UFRGS, Laboratoire d'Informatique de Grenoble (LIG/UJF)
  • Funding: CAPES (Brazilian Ministry of Education) - MSc Fellowship
  • Useful links: UFRGS - LUME Repository

DPC++: distributed processing in C++


Summary: DPC++ applies object-orientation as a basis for distributed programming. The main focus is to extend the C++ programming language with abstractions for object distribution and communication, as well as good load balancing among the resources. The user does not need to be aware of such operating aspects, as the DPC++ preprocessor performs all operations needed to distribute, communicate and coordinate distributed tasks and objects.

    Project details:
  • Period: 1995 - 1998
  • Team: UFRGS
  • Funding: CNPq (Brazilian National Research Council)

ArMA-GAPP: study and application of vector architectures


Summary: This project uses a vector processor architecture (NCR GAPP) and some C-based tools we have developed to run and evaluate image processing applications.

    Project details:
  • Period: 1993 - 1994
  • Team: UFRGS
  • Funding: CNPq (Brazilian National Research Council)