Learning from Data (Fall 2024)

Dataset & Project Ideas

Below are some potential datasets and topics for your class project. Please contact the course staff if you have questions about these datasets.

1. Planning and Behavior Synthesis with Generative Models

Leverage generative models (e.g., diffusion models) for planning in imitation learning or multi-agent systems.

Project ideas:

Compare generative planning with traditional planning approaches (e.g., A*, MCTS).
Explore how diffusion models can model high-dimensional action spaces.

Reference papers:

Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
Planning with Diffusion for Flexible Behavior Synthesis

2. AI for Construction Site Safety Monitoring

Applying machine learning techniques to construction site safety can yield systems that not only predict and prevent accidents but also automate and optimize safety measures in real time.

Project ideas:

Improving Safety Prediction under Noisy Data. Construction site data, especially from wearables or environmental sensors, may have noise or missing values. Develop robust algorithms to handle noisy, incomplete, or inconsistent data to improve safety prediction models.
Anomaly Detection for Safety Incidents. Develop an anomaly detection system that identifies unusual or potentially dangerous conditions in construction site data (e.g., unusual vibrations from equipment, irregular worker behavior).
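As a possible starting baseline for the anomaly detection idea above, the sketch below flags readings whose robust z-score (based on the median and the median absolute deviation) exceeds a threshold. The function names, the sample vibration readings, and the 3.5 threshold are illustrative assumptions, not part of any of the listed datasets or papers.

```python
import statistics

def robust_z_scores(values):
    """Return modified z-scores based on the median and MAD.

    The constant 0.6745 rescales the MAD so that the score is comparable
    to a standard z-score for normally distributed data.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        # All readings (or most of them) are identical: nothing to flag.
        return [0.0 for _ in values]
    return [0.6745 * (v - med) / mad for v in values]

def flag_anomalies(values, threshold=3.5):
    """Return the indices of readings whose |robust z-score| > threshold."""
    scores = robust_z_scores(values)
    return [i for i, z in enumerate(scores) if abs(z) > threshold]

# Hypothetical vibration readings with one obvious spike at the end.
readings = [1.0, 1.1, 0.9, 1.0, 1.2, 0.8, 9.0]
anomalous_indices = flag_anomalies(readings)
```

Median-based statistics are used here because a single extreme reading barely shifts them, which matters when the sensor data is noisy. A project would likely replace this with a learned model and compare against such a baseline.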

Reference papers:

Human-object interaction recognition for automatic construction site safety inspection
Deep learning for site safety: Real-time detection of personal protective equipment
Improved detection network model based on YOLOv5 for warning safety in construction sites

Related Datasets:

Construction site objects: SODA dataset, MOCS dataset, ACID dataset
Buildings: BCS dataset
Worker behavior: CMA dataset

3. Task Compositionality

How can a complex task be broken down into simpler component tasks whose solutions are then combined? This question can be studied in supervised learning (e.g., question answering, visual perception), unsupervised learning (e.g., generative models), and reinforcement learning (e.g., skill planning). How can multimodal tasks be decomposed? One can also take a theoretical perspective and investigate when a composed model can outperform a single model.

Reference papers:

Compositional Generative Modeling: A Single Model is Not All You Need (generative model)
Planning with Diffusion for Flexible Behavior Synthesis (RL)
Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills (RL)
Scalable Multi-agent Covering Option Discovery based on Kronecker Graphs (MARL)
A Complexity-Based Theory of Compositionality (theory)

4. Sparse Coding in Neural Networks

Investigate how integrating sparse coding principles into neural network architectures can improve model interpretability and efficiency, particularly in resource-constrained environments.
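For readers new to sparse coding, the sketch below shows one classical way to compute a sparse code: the iterative shrinkage-thresholding algorithm (ISTA) applied to the objective 0.5 * ||y - D z||^2 + lam * ||z||_1. It is a minimal pure-Python illustration under assumed names (`ista`, `soft_threshold`) and a toy dictionary, not the method of the reference paper. The step size is assumed to be at most 1 / L, where L is the largest eigenvalue of D^T D.

```python
import math

def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t (proximal map of t * L1 norm)."""
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in v]

def ista(D, y, lam, step, n_iter=100):
    """Approximately minimize 0.5 * ||y - D z||^2 + lam * ||z||_1 over z.

    D is a list of m rows with n entries each (the dictionary),
    y is a list of length m (the signal to encode).
    """
    m, n = len(D), len(D[0])
    z = [0.0] * n
    for _ in range(n_iter):
        # Residual D z - y, then gradient of the quadratic term: D^T (D z - y).
        residual = [sum(D[i][j] * z[j] for j in range(n)) - y[i] for i in range(m)]
        grad = [sum(D[i][j] * residual[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by soft-thresholding.
        z = soft_threshold([z[j] - step * grad[j] for j in range(n)], step * lam)
    return z

# Toy example: with an identity dictionary the solution is simply the
# soft-thresholded signal, so the small coefficient is set to zero.
identity = [[1.0, 0.0], [0.0, 1.0]]
code = ista(identity, [3.0, 0.5], lam=1.0, step=1.0)
```

Unrolling a fixed number of such iterations into network layers is one common route from sparse coding to neural architectures, which is a natural place for a project to start.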

Reference paper:

Deep Dictionary Learning: A PARametric NETwork Approach

5. Robust Dictionary Learning under Noise

Investigate methods to enhance the robustness of dictionary learning algorithms in the presence of noise and outliers, with applications in audio and image processing.

Reference paper:

Deep Convolutional Dictionary Learning for Image Denoising

6. Out-of-distribution Detection

Current neural networks can output likelihoods and probabilities for known classes. When a new class or a different style emerges, they may not perform well. Investigate and design a framework to detect new, unseen classes.
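A simple baseline to compare against is thresholding the maximum softmax probability: an input is treated as out-of-distribution when the classifier is not confident about any known class. The sketch below assumes the classifier exposes raw logits; the function names and the 0.7 threshold are illustrative assumptions, and this is not the method of the reference paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def is_ood(logits, threshold=0.7):
    """Flag an input as out-of-distribution when the top class
    probability falls below the (assumed) confidence threshold."""
    return max(softmax(logits)) < threshold

# Hypothetical logits: one confident prediction, one nearly uniform.
confident = is_ood([5.0, 0.0, 0.0])
uncertain = is_ood([0.1, 0.0, 0.05])
```

In practice the threshold would be tuned on held-out in-distribution data. Softmax confidence is known to be overconfident on some unseen inputs, which is one motivation for designing a better framework.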

Reference paper:

How to Overcome Curse-of-Dimensionality for Out-of-Distribution Detection?

7. Stable Chain-of-thought

Chain-of-thought has recently been developed for LLM systems. However, improving its stability and effectiveness across diverse tasks remains a challenge. Try to combine the methods you have learned to improve it.

Reference paper:

Why Can Large Language Models Generate Correct Chain-of-Thoughts?

8. Multiple Descent Behavior in Machine Learning

Analyze the conditions under which multiple descent behavior arises in contexts other than supervised learning (e.g., unsupervised learning, graph learning).

Reference paper:

Reconciling modern machine-learning practice and the classical bias–variance trade-off