Extrator de dados do jupiterweb

Overview

Extrator de dados do jupiterweb

O programa é composto de dois arquivos:

  • Um constando apenas de classes complementares que representam as unidades e as disciplinas
  • Outro que executa o processo de extração dos dados do jupiterweb

Em essência o programa faz um get para a pagina do jupiterweb que contem a lista das unidades, onde extra quais sao as unidades ativas e algumas informçoes sovre elas (nome e código da disciplina).

A partir da informação, extrai-se quais as disciplinas são ministradas e, em caso de encerradas, foram ministradas pelas unidades. Destas disciplinas existem as informaçoes basicas (Codigo, nome, data de ativação e desativação) e informaçoes complementares, que podem ser obrigatorias ou não (Creditos, Metodo, Docente, Tipo de recuperação, etc)

No codigo principal existe um método toJson que importa os dados extraidos relativo a cada unidade e suas disciplinas para um arquivo no formato Json.

O formato do arquivo Json é:

{
 codigo_da_unidade: {
   nome: "nome da unidade",
   code: "codigo_da_unidade",
   disciplinas: {
     codigo_da_disciplina: {
          nome: "nome_da_disciplina",
          codigo: "codigo_da_disciplina",
          ativacao: "data_de_ativacao",
          desativacao: "data_de_desativacao",
          credito_aula: "numero_de_creditos",
          credito_trabalho: "numero_de_creditos",
          tipo: "semestral/anual",
          objetivos: "objetivos_da_disciplina",
          docentes: "docentes",
          programa: "programa_da_disciplina",
          programa_resumido: "programa_resumido_da_disciplina",
          metodo: "metodo_de_avaliacao",
          criterio: "criterio_de_aprovacao",
          norma_de_recuperacao: "tipo_de_recuperacao",
          bibliografia: "bibliografia_da_disciplina"
    },
    #Outras_disciplinas_da_unidade#
    }
  },
  #outras_unidades#
}

Consta no repositório um arquivo extraído do jupiterweb no dia 02/12/2021 como exemplo. Lembrando que eventuais mudanças de layout no jupiterweb podem interferir no desempenho e bom funcionamento do algoritmo, ja que os dados são obtidos por meio de web scrapping e web crawling.

Owner
Bruno Aricó
Comp. Scientist and enthusiast in Hardware and Aviation
Bruno Aricó
A Pythonic Data Catalog powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to your big data workloads.

DeltaCAT DeltaCAT is a Pythonic Data Catalog powered by Ray. Its data storage model allows you to define and manage fast, scalable, ACID-compliant dat

45 Oct 15, 2022
Collie is for uncovering RDMA NIC performance anomalies

Collie is for uncovering RDMA NIC performance anomalies. Overview Prerequ

Bytedance Inc. 34 Dec 11, 2022
🏆 A ranked list of awesome Python open-source libraries and tools. Updated weekly.

Best-of Python 🏆 A ranked list of awesome Python open-source libraries & tools. Updated weekly. This curated list contains 230 awesome open-source pr

Machine Learning Tooling 2.7k Jan 03, 2023
Heads Down Application for Mac OSX

Heads Down A Mac app that lives in your ribbon—with a click of the mouse, temporarily block distracting websites and applications to encourage "heads

20 Mar 10, 2021
A python script to run any executable and pass test cases to it's stdin and compare stdout with correct output.

quera_testcase_checker A python script to run any executable and pass test cases to it's stdin and compare stdout with correct output. proper way to u

k3y1 1 Nov 15, 2021
Created a Python Keylogger script.

Python Script Simple Keylogger Script WHAT IS IT? Created a Python Keylogger script. HOW IT WORKS Once the script has been executed, it will automatic

AC 0 Dec 12, 2021
Uma versão em Python/Ursina do aplicativo Real Drum (android).

Real Drum Descrição Esta é uma versão alternativa feita em Python com a engine Ursina do aplicatio Real Drum (presente no Google Play Store). Como exe

hayukimori 5 Aug 20, 2022
Anti VirusTotal written in Python.

How it works Most of the anti-viruses on VirusToal uses sandboxes or vms to scan and detect malicious activity. The code checks to see if the devices

cliphd 3 Dec 26, 2021
This repo created to complete the task HACKTOBER 2021, contribute now and get your special T-Shirt & Sticker. TO SUPPORT OWNER PLEASE PRESS STAR BUTTON

❤ THIS REPO WILL CLOSED IN 31 OCT 00:00 ❤ This repository will automatically assign the hacktoberfest and hacktoberfest-accepted labels to all submitt

Rajendra Rakha 307 Dec 27, 2022
A Python library for inspecting JVM class files (.class)

lawu Lawu is a human-friendly library for assembling, disassembling, and exploring JVM class files. It's highly suitable for automation tasks. Documen

Tyler Kennedy 45 Oct 23, 2022
Module-based cryptographic tool

Cryptosploit A decryption/decoding/cracking tool using various modules. To use it, you need to have basic knowledge of cryptography. Table of Contents

/SNESE_AR\ 33 Nov 27, 2022
Python AVL Protocols Server for Codec 8 and Codec 8 Extended Protocols

pycodecs Package provides python AVL Protocols Server for Codec 8 and Codec 8 Extended Protocols This package will parse the AVL Data and log it in hu

Vardharajulu K N 2 Jun 21, 2022
GDIT: Geometry Dash Info Tool

GDIT: Geometry Dash Info Tool This is the first large script that allows you to quickly get information from the Geometry Dash server

dezz0xY 2 Jan 09, 2022
Projeto para ajudar no aprendizado da linguagem Pyhon

Economize Este projeto tem o intuito de criar desáfios para a codificação em Python, fazendo com que haja um maior entendimento da linguagem em seu to

Lucas Cunha Rodrigues 1 Dec 16, 2021
This is a pretty basic but relatively nice looking Python Pomodoro Timer.

Python Pomodoro-Timer This is a pretty basic but relatively nice looking Pomodoro Timer. Currently its set to a very basic mode, but the funcationalit

EmmHarris 2 Oct 18, 2021
Feature engineering library that helps you keep track of feature dependencies, documentation and schema

Feature engineering library that helps you keep track of feature dependencies, documentation and schema

28 May 31, 2022
Automation of VASP DFT workflows with ASE - application scripts

This repo contains a library that aims at automatizing some Density Functional Theory (DFT) workflows in VASP by using the ASE toolkit.

Frank Niessen 5 Sep 06, 2022
Reference python implementation of Chia pool operations for pool operators

This repository provides a sample server written in python, which is meant to server as a basis for a Chia Pool. While this is a fully functional implementation, it requires some work in scalability

Chia Network 451 Dec 13, 2022
A framework that let's you compose websites in Python with ease!

Perry Perry = A framework that let's you compose websites in Python with ease! Perry works similar to Qt and Flutter, allowing you to create componen

Linkus 13 Oct 09, 2022
Scraping comments from the political section of popular Nigerian blog (Nairaland), and saving in a CSV file.

Scraping_Nairaland This project scraped comments from the political section of popular Nigerian blog www.nairaland.com using the Python BeautifulSoup

Ansel Orhero 1 Nov 14, 2021