Project

Objective

Gain hands-on experience in designing and implementing data science experiments that address geophysics problems.

Topics

Choose one of the following challenges. Follow the instructions below to prepare your results according to the announced schedule.

The topics are outlined here in general terms and may be adjusted in the coming days.
Challenge 1:
— Description: Estimating Earthquake Epicentres
— Dataset: Challenge 1 dataset

N.B. This challenge was proposed during a summit of the ADAGEO project with the Earth Sciences department of UFRN, Brazil. The announced mentors are no longer available, so contact me on Slack if you have questions: Curation Slack Channel

Challenge 2:
— Description: If you would rather address metadata management than perform analytics, you can create an ATLAS repository with as much metadata as possible about the datasets related to your research, or about those proposed below.
— As a starting point, run the second ATLAS hands-on, available here: [HO-2b]
— Datasets and context description:

N.B. For questions, contact me on Slack: Curation Slack Channel

General Rules

  1. Work individually or in groups of 2 participants, balanced by gender, region and discipline where possible.
  2. Notebooks should be publicly accessible online (Kaggle, Colab, GitHub).
  3. Groups can use libraries, methods and any other material they require, provided they respect the authors’ intellectual property.
  4. Short demo: a video pitch that specifies the image privacy terms of its authors.

To Hand In

  1. A notebook with the complete profiling, study and preparation of your data. Use plots whenever possible (this is required for both challenges).
  2. If the data is prepared or engineered, a link to the repository of the prepared dataset.
  3. A notebook with the solution, explaining its principle and the phases of the study, with an interpretation of partial results, the final results and, fundamentally, the adopted assessment method. Use plots whenever possible.
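
As an illustration of what an "adopted assessment method" could look like for Challenge 1, the sketch below scores estimated epicentres by their great-circle (haversine) distance to reference epicentres. This is only one possible choice and is not prescribed by this brief. The function names, the (latitude, longitude) tuple layout in degrees, and the mean Earth radius of 6371 km are all assumptions made for the example; the actual dataset may use a different format.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius, not taken from the brief


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_epicentre_error_km(predicted, reference):
    """Mean distance between paired predicted and reference epicentres.

    Both arguments are sequences of (latitude, longitude) tuples in degrees
    (a hypothetical layout chosen for this sketch).
    """
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [haversine_km(p[0], p[1], r[0], r[1])
              for p, r in zip(predicted, reference)]
    return sum(errors) / len(errors)
```

Reporting the error in kilometres rather than in degrees keeps the score comparable across latitudes, since one degree of longitude covers less ground away from the equator.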

Schedule

  • Proposal of the project: 22/11/2023
  • First control (data profiling and exploration): 15/12/2023
  • Second control (problem statement and first strategy): 10/01/2024
  • Final results (pipeline, insight, and some degree of reproducibility): 20/01/2024