DPS Lab
State-of-the-art data and text analysis tools
About DPS Lab
Build your own customized pipelines
The software is designed for building large text and data processing pipelines composed of data input, preprocessing, processing, and interpretation of outputs. A results-sharing feature streamlines team communication. (A plain-Python sketch of the idea follows this list.)
Unique combination of objects
Text datasets, Mongo DB, Postgres DB, External API, YouTube, Scrapers, Tickers, Factor groups, Models, Parquet DS, NLP tasks, Pre-Processing tasks, Data quality check, Financial features, Models training, Summarization, Sentiment, Charts.
Versatility
The modules accept input data from many fields, such as biology, meteorology, and other domains characterized by stochastic behavior. The application can integrate several types of data and deliver a solution in the desired domain.
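As a rough mental model only (the platform is configured visually through nodes, not code, and none of the names below are DPS Lab APIs), a pipeline chains input, preprocessing, processing, and output steps:

# A rough mental model of a DPS Lab pipeline expressed in plain Python.
# None of these names are DPS Lab APIs; the platform itself is configured
# visually through nodes, not through code.
import re
from typing import Callable, Iterable

Step = Callable[[object], object]

def run_pipeline(data: object, steps: Iterable[Step]) -> object:
    """Feed the output of each step into the next one."""
    for step in steps:
        data = step(data)
    return data

# Example: lowercase the text, strip digits, count the remaining tokens.
result = run_pipeline(
    "DPS Lab processed 1200 documents in 3 runs.",
    [
        str.lower,                        # preprocessing
        lambda t: re.sub(r"\d+", "", t),  # remove numbers
        lambda t: len(t.split()),         # interpretation of the output
    ],
)
print(result)  # token count of the cleaned text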
How does DPS Lab work?
Inputs
The input objects are functions for importing data in various formats.
The platform can connect to public- and private-sector information systems, as well as to applications that publish data at regular, repeated intervals.
The data used ranges from open public-sector data (US Statements/EDI), through data downloaded by web crawlers (for example from Yahoo Finance or Seeking Alpha) and paid APIs (for example RapidAPI), to private data that users upload to the platform themselves.
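Illustrative only: how these input node types map onto common Python loaders (endpoint, table, and file names are placeholders, not DPS Lab defaults):

# Illustrative only: how the input node types map to common Python loaders.
# The endpoint, table and file names below are placeholders.
import pandas as pd
import requests

# Parquet DS node ~ reading a local parquet dataset
prices = pd.read_parquet("prices.parquet")

# External API node ~ fetching JSON from a public or paid endpoint
resp = requests.get("https://example.com/api/quotes",
                    params={"symbol": "AAPL"}, timeout=30)
resp.raise_for_status()
quotes = pd.DataFrame(resp.json())

# Postgres DB node ~ loading a table through SQLAlchemy
# from sqlalchemy import create_engine
# engine = create_engine("postgresql://user:password@host/dbname")
# statements = pd.read_sql("SELECT * FROM statements", engine)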
Text datasets
Node for textual dataset selection.
Mongo DB
Node for loading data from an external MongoDB database.
Postgres DB
Node for loading data from an external PostgreSQL database.
External API
Node for loading data from an external API.
YouTube
Node for loading data from YouTube videos.
Scraper
Node for loading data from a scraper.
Tickers
Node for Ticker selection.
Factor groups
Node for factor group selection.
Models
Node for model selection.
Parquet DS
Node for parquet dataset selection.
Preprocessing and Processing
The data collected by the platform is passed on to preprocessing and processing. The platform provides advanced data processing functions, such as summarization and sentiment identification tools.
These functions were developed using the latest knowledge in machine learning, deep learning, and artificial intelligence (AI), and cognitive technologies in general.
Tokenize
Tokenizes a string input into an array output. The language can be set to Czech or English.
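A minimal outside sketch of the same operation, using NLTK rather than the node's internal implementation:

# What the Tokenize node does, sketched with NLTK rather than DPS Lab
# internals: a string goes in, an array of tokens comes out, and the
# language can be switched (NLTK's punkt models include Czech).
import nltk

nltk.download("punkt", quiet=True)

english = nltk.word_tokenize("DPS Lab turns raw text into tokens.", language="english")
czech = nltk.word_tokenize("Analytická platforma zpracuje český text.", language="czech")
print(english)  # ['DPS', 'Lab', 'turns', 'raw', 'text', 'into', 'tokens', '.']
print(czech)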
Remove stop words
Removes basic stop words. The language can be set to Czech or English.
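A comparable sketch with NLTK's stop-word lists (English only; the node's Czech list is internal to the platform):

# Illustrative stop-word removal. NLTK ships an English list; a Czech
# run would need a custom word list.
import nltk
from nltk.corpus import stopwords

nltk.download("stopwords", quiet=True)
english_stops = set(stopwords.words("english"))

tokens = ["the", "platform", "removes", "the", "most", "common", "words"]
filtered = [t for t in tokens if t.lower() not in english_stops]
print(filtered)  # ['platform', 'removes', 'common', 'words']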
Check data quality
The input can be a ticker (stock) or a parquet file.
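Since the node's individual checks aren't listed here, a hedged sketch of typical quality checks with pandas:

# The node's exact checks are not documented; these are typical examples
# of what a data quality report over a parquet file can cover.
import pandas as pd

df = pd.read_parquet("prices.parquet")  # placeholder file name

report = {
    "rows": len(df),
    "duplicate_rows": int(df.duplicated().sum()),
    "missing_per_column": df.isna().sum().to_dict(),
    "dtypes": df.dtypes.astype(str).to_dict(),
}
print(report)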
Lemmatize
Lemmatizes text. The language can be set to Czech or English.
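An illustrative equivalent using the simplemma library, which handles both languages; the node's own lemmatizer may differ:

# Lemmatization sketch using the simplemma library, which covers both
# Czech and English; the node's own lemmatizer may differ.
import simplemma

for token in ["running", "studies"]:
    print(simplemma.lemmatize(token, lang="en"))  # e.g. run, study

for token in ["města", "koupil"]:
    print(simplemma.lemmatize(token, lang="cs"))  # e.g. město, koupit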
POS Tagger
Part-of-speech recognition. The language can be set to Czech or English.
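An English-only stand-in using NLTK's tagger (Czech tagging needs a dedicated model, which the node presumably bundles internally):

# English part-of-speech tagging with NLTK as a stand-in for the POS
# Tagger node; Czech needs a dedicated model not included in NLTK's
# default tagger.
import nltk

nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

tokens = nltk.word_tokenize("The model predicts stock returns.")
print(nltk.pos_tag(tokens))  # [('The', 'DT'), ('model', 'NN'), ...]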
Financial features
The input is a ticker (stock) and a selected factor group. The performance factors for the selected ticker are calculated according to that group.
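The specific factor groups are platform-defined; as an illustration only, a few common performance factors computed from a price series with pandas (file and column names are placeholders):

# Illustrative performance factors computed from a price series.
import pandas as pd

prices = pd.read_parquet("prices.parquet")["close"]  # placeholder names

features = pd.DataFrame({
    "daily_return": prices.pct_change(),
    "momentum_20d": prices.pct_change(20),
    "volatility_20d": prices.pct_change().rolling(20).std(),
})
print(features.tail())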
Drop columns
Drop selected columns (see the combined pandas sketch after Rename columns).
Rename columns
Rename selected columns.
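A combined pandas sketch covering both of these column nodes:

# The pandas equivalents of the Drop columns and Rename columns nodes.
import pandas as pd

df = pd.DataFrame({"open": [1.0], "close": [1.2], "misc": ["x"]})
df = df.drop(columns=["misc"])                 # Drop columns
df = df.rename(columns={"close": "px_close"})  # Rename columns
print(df.columns.tolist())  # ['open', 'px_close']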
Train model
The input is the parquet file and the selected model.
Pre-Processing
The most common preprocessing techniques: removing numbers, emoticons, accents, etc.
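A hedged sketch of such cleanups in plain Python; the exact rules in DPS Lab may differ:

# Typical cleanups behind a Pre-Processing node: numbers, simple ASCII
# emoticons, and accents.
import re
import unicodedata

def clean(text: str) -> str:
    text = re.sub(r"\d+", "", text)              # remove numbers
    text = re.sub(r"[:;=]-?[)(DPp]", "", text)   # remove simple emoticons
    text = unicodedata.normalize("NFKD", text)   # split accents off letters...
    text = text.encode("ascii", "ignore").decode("ascii")  # ...and drop them
    return re.sub(r"\s+", " ", text).strip()

print(clean("Zpráva č. 42 :-) vypadá dobře"))  # "Zprava c. vypada dobre"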
Summarize
Summarizes the given text.
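An illustrative way to do the same outside the platform, via the Hugging Face transformers pipeline (which summarization model DPS Lab uses internally is not specified here):

# Summarization sketch with the transformers pipeline; this downloads
# the library's default model on first run.
from transformers import pipeline

summarizer = pipeline("summarization")

article = (
    "DPS Lab is an analytical platform for building text and data "
    "processing pipelines. It imports data from databases, APIs and "
    "scrapers, preprocesses it, and offers summarization, sentiment "
    "analysis and model training on top of the cleaned data."
)
print(summarizer(article, max_length=40, min_length=5)[0]["summary_text"])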
Sentiment
Returns the sentiment of the given text.
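An illustrative equivalent using NLTK's VADER analyzer (English only; the node's own sentiment model is internal to the platform):

# Sentiment scoring sketch with NLTK's VADER.
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)

sia = SentimentIntensityAnalyzer()
print(sia.polarity_scores("Earnings beat expectations and the stock rallied."))
# {'neg': 0.0, 'neu': ..., 'pos': ..., 'compound': ...}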
Displaying and saving results
The data processing object is followed by a data storage object (a data repository).
Data can be exported from the platform in machine-readable formats, which allow further editing and import into other systems. Communication with third-party systems and applications is ensured both by the export option and by an API output, which enables automated data transfer to other systems.
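A pandas sketch of such machine-readable exports (file names are placeholders):

# The same frame written to CSV, JSON and parquet for downstream systems.
import pandas as pd

df = pd.DataFrame({"ticker": ["AAPL"], "sentiment": [0.8]})
df.to_csv("results.csv", index=False)
df.to_json("results.json", orient="records")
df.to_parquet("results.parquet")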
Console
Writes data to the browser's developer console and to the log at the bottom right.
Data quality report
Displays a clear report on the data quality check.
Chart
Displays data in a chart.
Video Guides
Tutorial for using the DPS Result Sharing application
The application is used to share results and to collaborate on joint team projects in the Analytical Platform. The video shows how the DPS Result Sharing module works.
Demonstration of a typical NLP task related to data preprocessing in the Analytical Platform DPS Lab
The video shows how the DPS Lab application lets you work with data in the Preprocessing and Processing objects on a typical NLP task.
Demonstration of data input in the Analytical Platform DPS Lab
This video provides a guide on how to input data in the DPS Lab application.
Demonstration of model training in the Analytical Platform DPS Lab
The video shows how the DPS Lab application can be used for model training.
As an output we get the Score value, which tells us what percentage of deterministic (i.e. non-random) movement the model explains, and the Coef value, which contains the model parameters, the so-called beta coefficients; these weight the value of each factor and thereby directly co-create the prediction.
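A hedged reconstruction with scikit-learn's LinearRegression, whose score and coef_ outputs match this description of Score and Coef (the platform's actual model selection may differ):

# With a linear model, Score corresponds to R² (the share of non-random
# variance explained) and Coef holds the beta coefficients weighting
# each factor.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))             # three factor columns
beta = np.array([0.5, -0.2, 0.0])         # true betas for the toy data
y = X @ beta + rng.normal(scale=0.1, size=200)

model = LinearRegression().fit(X, y)
print("Score:", model.score(X, y))        # R², close to 1 on this toy data
print("Coef:", model.coef_)               # recovered beta coefficients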