Machine Learning toolbox for Humans

Reproducible Experiment Platform (REP)

Join the chat at https://gitter.im/yandex/rep

REP is an IPython-based environment for conducting data-driven research in a consistent and reproducible way.

Main features:

  • unified python wrapper for different ML libraries (wrappers follow extended scikit-learn interface)
    • Sklearn
    • TMVA
    • XGBoost
    • uBoost
    • Theanets
    • Pybrain
    • Neurolab
    • MatrixNet service (available to CERN)
  • parallel training of classifiers on cluster
  • classification/regression reports with plots
  • interactive plots supported
  • smart grid-search algorithms with parallel execution
  • research versioning using git
  • pluggable quality metrics for classification
  • meta-algorithm design (aka 'rep-lego')

REP does not try to replace scikit-learn; it extends it and provides a better user experience.

Howto examples

To get started, look at the notebooks in /howto/

Notebooks can be viewed (not executed) online at nbviewer.
There are basic introductory notebooks (about Python and IPython) and more advanced ones (about REP itself).

The example code is written in Python 2, but the library is compatible with both Python 2 and Python 3.

Installation with Docker

We provide a Docker image with REP and all its dependencies. This is the recommended way, especially if you're not experienced in Python.

Installation with bare hands

However, if you want to install REP and all of its dependencies on your machine yourself, follow this manual: installing manually and running manually.

License

Apache 2.0. The library is open source.

Minimal examples

REP wrappers are sklearn compatible:

from rep.estimators import XGBoostClassifier, SklearnClassifier, TheanetsClassifier
clf = XGBoostClassifier(n_estimators=300, eta=0.1).fit(trainX, trainY)
probabilities = clf.predict_proba(testX)

A favorite trick of Kagglers is to run bagging over complex algorithms. This is how it is done in REP:

from sklearn.ensemble import BaggingClassifier
clf = BaggingClassifier(base_estimator=XGBoostClassifier(), n_estimators=10)
# wrapping sklearn to REP wrapper
clf = SklearnClassifier(clf)

Another useful trick is to use folding instead of splitting data into train/test. This is especially useful when you're using some kind of complex stacking:

from rep.metaml import FoldingClassifier
clf = FoldingClassifier(TheanetsClassifier(), n_folds=3)
probabilities = clf.fit(X, y).predict_proba(X)

In the example above, all data are split into 3 folds, and each fold is predicted by a classifier that was trained on the other 2 folds.
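The folding idea can be sketched without any ML library. This is only an illustration of out-of-fold prediction, not REP's implementation: `assign_folds`, `out_of_fold_predictions`, and the toy "mean label" model are hypothetical names introduced for this example.

```python
import random


def assign_folds(n_samples, n_folds, seed=0):
    """Give every sample a fold index in [0, n_folds), with balanced fold sizes."""
    folds = [i % n_folds for i in range(n_samples)]
    random.Random(seed).shuffle(folds)
    return folds


def out_of_fold_predictions(X, y, n_folds, fit, predict, seed=0):
    """Predict each sample with a model trained only on the other folds."""
    folds = assign_folds(len(X), n_folds, seed)
    predictions = [None] * len(X)
    for k in range(n_folds):
        train_idx = [i for i, f in enumerate(folds) if f != k]
        test_idx = [i for i, f in enumerate(folds) if f == k]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        fold_predictions = predict(model, [X[i] for i in test_idx])
        for i, p in zip(test_idx, fold_predictions):
            predictions[i] = p
    return predictions, folds


# Toy model (an assumption for the demo): always predicts the mean training label.
def fit_mean(X_train, y_train):
    return sum(y_train) / float(len(y_train))


def predict_mean(model, X_test):
    return [model for _ in X_test]


if __name__ == "__main__":
    X = list(range(6))
    y = [0, 1, 0, 1, 0, 1]
    predictions, folds = out_of_fold_predictions(X, y, 3, fit_mean, predict_mean)
    print(folds)
    print(predictions)
```

Because no sample is ever predicted by a model that saw it during training, the predictions for the whole dataset can be used downstream (for example as stacking inputs) without a separate hold-out set.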

REP classifiers also provide a report:

from rep.report.metrics import RocAuc

report = clf.test_on(testX, testY)
report.roc().plot()  # plot ROC curve
# learning curves are useful when training GBDT!
report.learning_curve(RocAuc(), steps=10)
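
For readers unfamiliar with the metric behind the ROC curve, here is a minimal, library-free sketch of what an area-under-ROC score measures: the probability that a randomly chosen positive sample is scored above a randomly chosen negative one. The `roc_auc` function below is a hypothetical helper written for this illustration; it is not the `RocAuc` class from `rep.report.metrics` and makes no claim about how that class is implemented.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve by pairwise comparison, O(n_pos * n_neg).

    labels: iterable of 0/1 class labels
    scores: iterable of predicted scores (higher means "more positive")
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

A learning curve then simply tracks such a score as the number of training iterations grows.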

You can read about other REP tools (like smart distributed grid search, folding and factory) in the documentation and howto examples.

Comments
  • Problem with TMVAClassifier

    After installing REP from here, I ran into the following problem when fitting TMVAClassifier. An IOError is raised after the following lines:

    baseline = TMVAClassifier(method='kBDT', features=variables, BoostType='Grad', NTrees=40, Shrinkage=0.01, MaxDepth=7, UseNvars=6, nCuts=-1)
    baseline.fit(train, train['signal'])

    The stack trace is:

    IOError                                   Traceback (most recent call last)
    in ()
          3                           UseNvars=6, nCuts=-1)
          4 # baseline = TMVAClassifier(method='kBDT', NTrees=50, Shrinkage=0.05, features=variables)
    ----> 5 baseline.fit(train, train['signal'])

    /usr/local/lib/python2.7/dist-packages/rep-0.6.3-py2.7.egg/rep/estimators/tmva.pyc in fit(self, X, y, sample_weight)
        288             self.factory_options = '{}:AnalysisType=Multiclass'.format(self.factory_options)
        289
    --> 290         return self._fit(X, y, sample_weight=sample_weight)
        291
        292     def predict_proba(self, X):

    /usr/local/lib/python2.7/dist-packages/rep-0.6.3-py2.7.egg/rep/estimators/tmva.pyc in _fit(self, X, y, sample_weight, model_type)
        104         add_info = _AdditionalInformation(directory, model_type=model_type)
        105         try:
    --> 106             self._run_tmva_training(add_info, X, y, sample_weight)
        107         finally:
        108             self._remove_tmp_directory(directory)

    /usr/local/lib/python2.7/dist-packages/rep-0.6.3-py2.7.egg/rep/estimators/tmva.pyc in _run_tmva_training(self, info, X, y, sample_weight)
        134         xml_filename = os.path.join(info.directory, 'weights',
        135                                     '{job}{name}.weights.xml'.format(job=info.tmva_job, name=self._method_name))
    --> 136         with open(xml_filename, 'r') as xml_file:
        137             self.formula_xml = xml_file.read()
        138

    IOError: [Errno 2] No such file or directory: '/home/artem/Documents/IPython Notebooks/CERN + Yandex/Original Baseline/flavours-of-physics-start/tmp0Fhtqe/weights/TMVAEstimation_REP_Estimator.weights.xml'

    As far as I can tell, the weights/ folder was created outside the temporary folder instead of inside it, which causes the error above.

    ROOT 5.34, Python 2.7, GCC 4.8, Ubuntu 14.04 LTS (x64). All requirements for REP were installed successfully (from requirements.txt).

    bug 
    opened by HolyBayes 9
  • FoldingClassifier: KFold vs StratifiedKFold

    Hey,

    First of all, a compliment: I really like your repo and I have built a lot of code on it; it's so useful! About the FoldingClassifier: there was already a request to implement StratifiedKFold in addition to the "normal" KFold. I would be very glad to see this, but I'd even go a step further: why don't you completely replace KFold with StratifiedKFold?

    I think, from an ML point of view, it is always better (or at least equally good) to use a stratified one. A normal KFold only introduces different class balances, which (usually) result in "shifted" probabilities among the different classifiers, whereas a stratified one does not and therefore makes each trained classifier's predictions "comparable".

    Or in other words: I cannot think of any case where you want to have a non-stratified KFolding instead of a stratified one.
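
The class-balance argument can be illustrated with a small, library-free sketch. It uses a deliberately extreme case (labels sorted by class, folds taken as contiguous blocks without shuffling) so the effect is easy to see; `plain_folds`, `stratified_folds`, and `class_counts_per_fold` are hypothetical helpers for this illustration, not scikit-learn's or REP's implementations.

```python
from collections import Counter, defaultdict


def plain_folds(labels, n_folds):
    """Contiguous, label-blind blocks (like an unshuffled k-fold split)."""
    n = len(labels)
    return [i * n_folds // n for i in range(n)]


def stratified_folds(labels, n_folds):
    """Deal the samples of each class round-robin across the folds."""
    seen = defaultdict(int)
    folds = []
    for label in labels:
        folds.append(seen[label] % n_folds)
        seen[label] += 1
    return folds


def class_counts_per_fold(labels, folds, n_folds):
    """Count how many samples of each class land in each fold."""
    counts = [Counter() for _ in range(n_folds)]
    for label, fold in zip(labels, folds):
        counts[fold][label] += 1
    return [dict(c) for c in counts]


if __name__ == "__main__":
    labels = [0] * 6 + [1] * 3
    print(class_counts_per_fold(labels, plain_folds(labels, 3), 3))
    print(class_counts_per_fold(labels, stratified_folds(labels, 3), 3))
```

In the plain split, one fold holds every positive sample, so the classifiers trained on the other folds see very different class ratios; in the stratified split every fold keeps the overall 2:1 ratio.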

    What do you think?

    Best, Mayou

    enhancement 
    opened by jonas-eschle 5
  • Support for a build hosted on (ana)conda

    I see that some of the continuous integration scripts support conda builds, although not all the dependencies are installed this way. Is there any hope of seeing a build on conda soon for Linux x86_64 systems?

    The reason I ask is that I have accounts on numerous batch systems, on none of which I have root access or any way to use Docker. They're all Linux-based though, as is the norm. As far as I know, this is the case for many researchers.

    It'd be great to see a way to quickly install REP on these systems. This would:

    • Cut down on the time needed to introduce people to REP
    • Hook into the environment management and environment logging provided by conda
    • Easily and quickly deploy REP on supercomputing nodes while requiring little of their filesystem

    This is especially useful for ensuring the ROOT install is sane. I know there has already been a lot of work in the direction of making REP easy to access and install. Perhaps this could be a healthy addition?

    question 
    opened by ewengillies 5
  • Add ability to initialise FoldingBase objects with external parser

    If you would like to run REP with e.g. a StratifiedKFold instead of a normal KFold, this will be possible after the pull request. If no external folder object is passed, the default KFold algorithm is used.

    opened by mschlupp 5
  • test_xgboost file is not running on windows 10

    File "c:\Sander\my_code\rep-master\tests\test_xgboost.py", line 4, in
        from rep.estimators import XGBoostClassifier, XGBoostRegressor
    ImportError: cannot import name XGBoostClassifier

    This happens when the rep installation is OK but the xgboost install fails:

    Microsoft Windows Version 10.0.10586 2015 Microsoft Corporation. All rights reserved.

    c:\Sander>pip install rep --no-dependencies
    Collecting rep
      Downloading rep-0.6.5.tar.gz (72kB)
        100% |################################| 81kB 511kB/s
    Building wheels for collected packages: rep
      Running setup.py bdist_wheel for rep ... done
      Stored in directory: C:\Users\Sander\AppData\Local\pip\Cache\wheels\db\ee\06\ac6e3f3ec208edaee29654f0b55ffaf2719a51de799c396b91
    Successfully built rep
    Installing collected packages: rep
    Successfully installed rep-0.6.5
    You are using pip version 8.1.0, however version 8.1.2 is available.
    You should consider upgrading via the 'python -m pip install --upgrade pip' command.

    c:\Sander>pip install xgboost==0.4a30
    Collecting xgboost==0.4a30
      Downloading xgboost-0.4a30.tar.gz (753kB)
        100% |################################| 757kB 553kB/s
    No files/directories in c:\users\sander\appdata\local\temp\pip-build-exobfm\xgboost\pip-egg-info (from PKG-INFO)
    You are using pip version 8.1.0, however version 8.1.2 is available.
    You should consider upgrading via the 'python -m pip install --upgrade pip' command.

    c:\Sander>

    opened by Sandy4321 5
  • Manual Install on Windows

    Hi! Is there a way to install REP manually in a Windows environment? When installing dependencies, I get an error when installing gnureadline:

    Error: this module is not meant to work on Windows (try pyreadline instead)

    Is there a way to use pyreadline for Windows users?

    wontfix 
    opened by funkindy 4
  • Mac OS installation with docker

    It seems the latest Docker release deprecates boot2docker: http://docs.docker.com/installation/mac/ "This release of Docker deprecates the Boot2Docker command line in favor of Docker Machine"

    How do I install REP with the latest Docker release?

    opened by pupadupa 4
  • test failed

    After python setup.py install, I ran cd tests ; nosetests . It ran for a long time and ended with errors:

    ..Info in <TCanvas::Print>: png file /tmp/tmpBg1dar.png has been created
    Error in <TFile::TFile>: file toy_datasets/toyMC_bck_mass.root does not exist
    E..E.
    ======================================================================
    ERROR: tests.z_test_notebook.test_notebooks_in_folder('/root/rep/howto/00-intro-ROOT.ipynb',)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
        self.test(*self.arg)
      File "/root/rep/rep/test/test_notebooks.py", line 43, in check_single_notebook
        raise RuntimeError(description)
    RuntimeError: Cell failed: 'T.Draw("min_DOCA")
    c1'
    
     Traceback:
    ---------------------------------------------------------------------------
    ReferenceError                            Traceback (most recent call last)
    <ipython-input-5-aa6c7320180d> in <module>()
    ----> 1 T.Draw("min_DOCA")
          2 c1
    
    ReferenceError: attempt to access a null-pointer
    

    What am I missing?

    opened by anaderi 3
  • Updating numpy in 0.6.6 docker breaks matplotlib

    % docker run -ti yandex/rep:0.6.6 bash -lc 'pip install -U numpy; python -c "from matplotlib import pyplot as plt; plt.figure()"'
    Activate: ROOT has been sourced. Environment settings are ready.
    ROOTSYS=/root/miniconda/envs/rep_py2
    Deactivate:Unsetting ROOT environment variables..
    Activate: ROOT has been sourced. Environment settings are ready.
    ROOTSYS=/root/miniconda/envs/rep_py2
    Collecting numpy
      Downloading numpy-1.11.2-cp27-cp27mu-manylinux1_x86_64.whl (15.3MB)
        100% |################################| 15.3MB 46kB/s
    Installing collected packages: numpy
      Found existing installation: numpy 1.10.4
        DEPRECATION: Uninstalling a distutils installed project (numpy) has been deprecated and will be removed in a future version. This is due to the fact that uninstalling a distutils project will only partially uninstall the project.
        Uninstalling numpy-1.10.4:
          Successfully uninstalled numpy-1.10.4
    Successfully installed numpy-1.11.2
    /root/miniconda/envs/rep_py2/lib/python2.7/site-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.
      warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')
    bash: line 1:   222 Illegal instruction     python -c "from matplotlib import pyplot as plt; plt.figure()"
    
    opened by sashabaranov 2
  • do we need to measure fit/predict time without %time?

    This would be useful if the Jupyter frontend disconnects during fit/predict execution.

    The following snippet might be handy for such cases:

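    from __future__ import print_function  # keeps print() working on Python 2

    import datetime
    import time


    class Stopwatch(object):
        """Context manager that records the wall-clock time spent in its block."""

        def __enter__(self):
            self.t0 = datetime.datetime.now()
            return self

        def __exit__(self, type, value, traceback):
            self.t1 = datetime.datetime.now()

        def __repr__(self):
            return "delta: (%s)" % (self.t1 - self.t0)


    # time.sleep stands in for the real fit/predict calls
    with Stopwatch() as sfit:
        time.sleep(1)
    with Stopwatch() as spredict:
        time.sleep(1)

    print("fit:", sfit, "spredict:", spredict)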
    
    opened by anaderi 2
  • New REP docker version running in /var/lib/docker/volumes/ instead of ~/rep_container

    Hi.

    I had an old REP Docker version in ~/rep_container, which started with the run.sh script on port 8080. I updated REP and it broke: sudo $REPDIR/run.sh worked, but I couldn't connect to localhost:8080 (connection refused). I decided to update Docker and REP according to the new instructions: https://github.com/yandex/rep/wiki/Install-REP-with-Docker-(Linux).

    1. I installed Docker, according to instructions.
    2. netstat -anl | grep 8888 gave empty result
    3. git checkout https://github.com/yandex/rep.git didn't work (pathspec did not match any file(s) known to git), so I used git clone instead.
    4. First run of sudo make run was successful and installed container.
    5. I rebooted and second sudo make run gave the following

    docker run -ti --rm -p 8888:8888 --name rep yandex/rep:0.6.4
    Error response from daemon: Conflict. The name "rep" is already in use by container 3af0884aeedb. You have to remove (or rename) that container to be able to reuse that name.
    make: *** [run] Error 1

    6. I ran sudo docker images:

    REPOSITORY    TAG      IMAGE ID       CREATED        VIRTUAL SIZE
    yandex/rep    0.6.4    18a48bc5a3b6   8 hours ago    2.635 GB
    anaderi/rep   latest   63c3db2850b6   4 months ago   1.649 GB
                           91c95931e552   7 months ago   910 B

    7. I tried sudo docker start rep. It worked and I opened REP on localhost:8888. But its working folder changed. Now it is /var/lib/docker/volumes/dbcc7ff99538007d9c6b244fb6b8f03bdcfd564f6076b36d79fa3330d2041107/_data/. This is quite inconvenient, because it requires superuser rights to access and is not conveniently located at all.

    Question: Is this the new behaviour, or did I do something wrong? If the latter, how do I fix it and run the REP container in a convenient folder?

    opened by lodurality 2
  • Bump notebook from 4.2.1 to 6.4.12

    Bumps notebook from 4.2.1 to 6.4.12.

    dependencies 
    opened by dependabot[bot] 0
  • Update lib

    Issue:

    ModuleNotFoundError                       Traceback (most recent call last)
    in
          5 from sklearn.ensemble import HistGradientBoostingClassifier
          6 from rep.report.metrics import RocAuc
    ----> 7 from rep.metaml import GridOptimalSearchCV, FoldingScorer, RandomParameterOptimizer
          8 from rep.estimators import SklearnClassifier

    ~/.local/lib/python3.8/site-packages/rep/metaml/__init__.py in
          2
          3 from .factory import ClassifiersFactory, RegressorsFactory
    ----> 4 from .folding import FoldingClassifier, FoldingRegressor
          5 from .gridsearch import GridOptimalSearchCV
          6 from .stacking import FeatureSplitter

    ~/.local/lib/python3.8/site-packages/rep/metaml/folding.py in
         11
         12 from sklearn import clone
    ---> 13 from sklearn.cross_validation import KFold
         14 from sklearn.utils import check_random_state
         15 from . import utils

    ModuleNotFoundError: No module named 'sklearn.cross_validation'

    Correction suggested based on https://stackoverflow.com/questions/30667525/importerror-no-module-named-sklearn-cross-validation
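
The failing import in the traceback above is a renamed-module problem: newer scikit-learn releases provide the cross-validation tools in `sklearn.model_selection` instead of `sklearn.cross_validation`. A common way to support both is to try the module names in order. The helper below is a minimal sketch of that pattern; `import_first` is a hypothetical name, not part of REP or scikit-learn.

```python
import importlib


def import_first(*module_names):
    """Return the first module from module_names that can be imported."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:  # ModuleNotFoundError is a subclass of ImportError
            continue
    raise ImportError("none of these modules could be imported: %s"
                      % ", ".join(module_names))


# Intended use in folding.py (assumes scikit-learn is installed):
#   cv = import_first("sklearn.model_selection", "sklearn.cross_validation")
#   KFold = cv.KFold
```

Note that fixing the import alone may not be enough: the `KFold` constructor arguments and the way splits are iterated differ between the two modules, so the calling code in `folding.py` would likely need adjusting too.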

    opened by RobsonRocha 1
  • Bump requests from 2.9.1 to 2.20.0

    Bumps requests from 2.9.1 to 2.20.0.

    Changelog

    Sourced from requests's changelog.

    2.20.0 (2018-10-18)

    Bugfixes

    • Content-Type header parsing is now case-insensitive (e.g. charset=utf8 v Charset=utf8).
    • Fixed exception leak where certain redirect urls would raise uncaught urllib3 exceptions.
    • Requests removes Authorization header from requests redirected from https to http on the same hostname. (CVE-2018-18074)
    • should_bypass_proxies now handles URIs without hostnames (e.g. files).

    Dependencies

    • Requests now supports urllib3 v1.24.

    Deprecations

    • Requests has officially stopped support for Python 2.6.

    2.19.1 (2018-06-14)

    Bugfixes

    • Fixed issue where status_codes.py's init function failed trying to append to a __doc__ value of None.

    2.19.0 (2018-06-12)

    Improvements

    • Warn user about possible slowdown when using cryptography version < 1.3.4
    • Check for invalid host in proxy URL, before forwarding request to adapter.
    • Fragments are now properly maintained across redirects. (RFC7231 7.1.2)
    • Removed use of cgi module to expedite library load time.
    • Added support for SHA-256 and SHA-512 digest auth algorithms.
    • Minor performance improvement to Request.content.
    • Migrate to using collections.abc for 3.7 compatibility.

    Bugfixes

    • Parsing empty Link headers with parse_header_links() no longer return one bogus entry.
    ... (truncated)
    Commits
    • bd84045 v2.20.0
    • 7fd9267 remove final remnants from 2.6
    • 6ae8a21 Add myself to AUTHORS
    • 89ab030 Use comprehensions whenever possible
    • 2c6a842 Merge pull request #4827 from webmaven/patch-1
    • 30be889 CVE URLs update: www sub-subdomain no longer valid
    • a6cd380 Merge pull request #4765 from requests/encapsulate_urllib3_exc
    • bbdbcc8 wrap url parsing exceptions from urllib3's PoolManager
    • ff0c325 Merge pull request #4805 from jdufresne/https
    • b0ad249 Prefer https:// for URLs throughout project
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot[bot] 0
  • Changes to TMVA API in new ROOT versions break TMVAClassifier

    Hi all,

    first of all, I wanted to thank and compliment the developers for this brilliant library. I finally had the chance to start playing with it today, but I was stopped in my tracks when trying to use a TMVAClassifier:

    AssertionError: ERROR: TMVA process is incorrect finished 
     LOG: None 
     Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/home/ludo/miniconda3/envs/pyroot/lib/python2.7/site-packages/rep/estimators/_tmvaFactory.py", line 86, in main
        tmva_process(classifier, info, data, labels, sample_weight)
      File "/home/ludo/miniconda3/envs/pyroot/lib/python2.7/site-packages/rep/estimators/_tmvaFactory.py", line 40, in tmva_process
        factory.AddVariable(var)
    AttributeError: 'Factory' object has no attribute 'AddVariable'
    

    My ROOT/TMVA versions are:

    You are running ROOT Version: 6.08/00, Nov 4, 2016
    TMVA Version 4.2.1, Feb 5, 2015
    

    Searching the web for this error message led me to this post on the ROOT forum: https://root-forum.cern.ch/t/25090, where the cause of the problem is identified as a breaking change in the TMVA API:

    In recent ROOT versions (6.06 or 6.08, don't remember exactly), the TMVA interface has changed. You need to create a TMVA::DataLoader and call AddVariable on the dataloader object.

    As I understand, this is related to what was mentioned by @gandreassi in a comment to #104. Any idea on how complicated it would be to adapt tmva_process to the new interface?
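
One possible shape for such an adaptation is to check at run time whether the factory still exposes `AddVariable` and, if not, register the variables on a data loader instead, as the forum quote describes. The sketch below uses plain Python stand-in classes (`OldStyleFactory`, `NewStyleFactory`, `StandInDataLoader`) so that it runs without ROOT; these classes and `register_variables` are hypothetical and are not the ROOT/TMVA or REP API.

```python
class OldStyleFactory(object):
    """Stand-in for an older factory that registers variables itself."""

    def __init__(self):
        self.variables = []

    def AddVariable(self, name):
        self.variables.append(name)


class NewStyleFactory(object):
    """Stand-in for a newer factory that no longer has AddVariable."""


class StandInDataLoader(object):
    """Stand-in for the object that takes over variable registration."""

    def __init__(self):
        self.variables = []

    def AddVariable(self, name):
        self.variables.append(name)


def register_variables(factory, variables, make_dataloader):
    """Add variables to the factory if it supports that, else to a new data loader.

    Returns the object that received the variables, so that later steps
    can keep using it.
    """
    target = factory if hasattr(factory, "AddVariable") else make_dataloader()
    for var in variables:
        target.AddVariable(var)
    return target


if __name__ == "__main__":
    target = register_variables(NewStyleFactory(), ["pt", "eta"], StandInDataLoader)
    print(type(target).__name__, target.variables)
```

Other calls in `tmva_process` may need similar treatment; this sketch only covers the `AddVariable` call named in the error message.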

    opened by fndari 1
Releases(0.6.6)
  • 0.6.6(Aug 9, 2016)

    • python2 and python3 dockers
    • updated libraries
    • added CacheClassifier
    • minimized size of docker image, simplified building process
    • some fixes for ML libraries
    • some documentation updates
    • deleted plot.ly
    • solved theanets reproducibility
    Source code(tar.gz)
    Source code(zip)
  • 0.6.5(Feb 3, 2016)

    Fixes:

    • TMVA process correct termination
    • TMVA fix for Mac OS El Capitan (problems with dynamic libraries paths)
    • fix travis (show not passed tests, create docker on dockerhub)
    • fix wget in notebooks
    • fix errors calculation in efficiencies (for flatness property)
    • added Makefile
    • fix normalization in the multidimensional metric
    Source code(tar.gz)
    Source code(zip)
  • 0.6.4(Nov 21, 2015)

    • Add continuous integration
    • Python 3 support
    • Conda installation in docker and travis
    • Kitematic-friendly docker
    • Update all libraries versions
    • added Folding Regressor, added feature importances for folding
    • added minimization to gridsearch, added random gridsearch from distributions
    • added folding scorer for regressor to gridsearch
    • faster tests
    • updated notebooks
    • Fixes:
      • tmva termination
      • documentation for grid search
      • Gridsearch bugs with metrics (metric fit)
      • learning curve with mask for folding
    Source code(tar.gz)
    Source code(zip)
  • 0.6.3(Jul 30, 2015)

  • 0.6.2(Jul 6, 2015)

    • Support of neural networks in common interface:

      • theanets
      • neurolab
      • pybrain

      Now all the REP stuff is available for classifiers and regressors from these libraries:

      • usage inside sklearn pipeline
      • grid_search for hyper parameter optimization
      • reports, parallel training on cluster
    • New lovely documentation, check it out!

    • Fixes in metaclassifiers connected with usage of expressions-as-features

    • Rewritten FeatureSplitter

    • Switched to sklearn 0.16

    • New method train_test_split_group - splitting into train and test by the value of special column. Samples with same values are either both in train or both in test.

    • Update howto/notebooks with new open physical datasets

    Source code(tar.gz)
    Source code(zip)
  • 0.6.1(May 22, 2015)

    • Tmva implementation enhancement with root_numpy https://github.com/yandex/rep/issues/2.
    • Add FPRatTPR (return fpr value at fixed tpr) and TPRatFPR (return tpr value at fixed fpr) metrics, which are required, e.g. for tuning online triggering system. Moreover learning curves are available for these metrics now.
    • Many improvements in documentation.
    Source code(tar.gz)
    Source code(zip)
  • 0.6.0(May 12, 2015)

    • unified classifiers wrapper for variety of implementations: TMVA, Sklearn, XGBoost, uBoost
    • parallel training of classifiers on cluster
    • classification/regression reports with plots
    • support of interactive plots (bokeh, plotly)
    • grid-search with parallelized execution on a cluster
    • git, versioning of research
    • computation of different classification metrics
    • partial support of python 3.
    Source code(tar.gz)
    Source code(zip)
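
The 0.6.2 notes above mention train_test_split_group, which splits by the value of a special column so that samples sharing a value never end up on both sides. The idea can be sketched in plain Python as follows. This is an illustration only: `train_test_split_by_group`, its parameters, and the `event` column are hypothetical and do not reproduce REP's actual function signature.

```python
import random


def train_test_split_by_group(rows, group_of, test_fraction=0.5, seed=0):
    """Split rows so that all rows of one group go to the same side.

    rows: list of samples
    group_of: function returning the (sortable) group value of a row
    test_fraction: approximate fraction of *groups* sent to the test set
    """
    groups = sorted(set(group_of(row) for row in rows))
    random.Random(seed).shuffle(groups)
    n_test = int(round(len(groups) * test_fraction))
    test_groups = set(groups[:n_test])
    train = [row for row in rows if group_of(row) not in test_groups]
    test = [row for row in rows if group_of(row) in test_groups]
    return train, test


if __name__ == "__main__":
    # Two samples per event; samples of one event must stay together.
    rows = [{"event": e, "x": i} for i, e in enumerate([1, 1, 2, 2, 3, 3, 4, 4])]
    train, test = train_test_split_by_group(rows, lambda row: row["event"])
    print(sorted(set(r["event"] for r in train)))
    print(sorted(set(r["event"] for r in test)))
```

Splitting by group rather than by row avoids leaking information between train and test when several rows describe the same underlying event.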
Owner
Yandex
Yandex open source projects and technologies