pubmex.py - a script to get a fancy paper title based on given DOI or PMID

Overview

Pu(b)mex

pubmex.py is a script that builds a tidy, descriptive file name for a paper from a given DOI or PMID (it can also be run from the macOS Finder).

Format of the title:

first author . last author - (title ("dotted") or your custom title) . PMID . journal . year . pdf
e.g.
  Kelley.Scott.The.evolution.biology.shift.towards.engineering.prediction-generating.tools.away.traditional.research.practice.EMBORep.2008.pdf
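
As a rough illustration of the naming scheme, here is a minimal sketch that assembles such a name from a PubMed-style summary record. It is not the code pubmex.py uses: the function name `dotted_filename` and the stop-word list are assumptions made for this example, it follows the examples in this README (which show no PMID in the name), and it takes the first token of each author entry as the surname, so multi-word surnames would need extra handling. The record fields are the ones visible in the `--debug` output quoted in the Comments section.

```python
import re

# Hypothetical stop-word list; the list used by the real script is not shown here.
STOP_WORDS = {"a", "an", "and", "of", "the", "for", "in", "on", "to", "from"}


def dotted_filename(summary):
    """Build 'First.Last.Dotted.Title.Journal.Year.pdf' from a summary dict."""
    # Surnames: first token of the first author and of the last author.
    first = summary["AuthorList"][0].split()[0]
    last = summary["LastAuthor"].split()[0]
    # Strip punctuation (hyphens are kept), then drop stop words and empty tokens.
    words = [re.sub(r"[^\w-]", "", w) for w in summary["Title"].split()]
    words = [w for w in words if w and w not in STOP_WORDS]
    # Journal abbreviation without spaces or dots, e.g. 'Mol Cell' -> 'MolCell'.
    journal = summary["Source"].replace(" ", "").replace(".", "")
    # PubDate looks like '2020 Feb 20'; the year is the first token.
    year = summary["PubDate"].split()[0]
    return ".".join([first, last] + words + [journal, year, "pdf"])


example = {
    "AuthorList": ["Reißer S", "Zucchelli S", "Gustincich S", "Bussi G"],
    "LastAuthor": "Bussi G",
    "Title": "Conformational ensembles of an RNA hairpin using molecular "
             "dynamics and sparse NMR data.",
    "Source": "Nucleic Acids Res",
    "PubDate": "2020 Feb 20",
}
name = dotted_filename(example)
print(name)
```

The real script may treat stop words, hyphenated words, and capitalisation differently, so its output for the same record can differ from this sketch.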

With Mendeley and similar tools this is not a big issue nowadays, however...

I don’t want to put every PDF file I collect along the way into my library, because the library then gets very big (and then it’s hard to sync, for example with Dropbox). So now I keep these PDF files in a pdf-icebox and rename them nicely and automatically:

$ ls
Hnisz.Sharp.Phase.Separation.Model.Transcriptional.Control.Cell.2017.pdf
Sharp.Hockfield.Convergence.The.future.health.Science.2017.pdf

Usage:

$ pubmex.py sharp2017.pdf
Sharp.Hockfield.Convergence.The.future.health.Science.2017.pdf
mv sharp2017.pdf --> ./Sharp.Hockfield.Convergence.The.future.health.Science.2017.pdf

$ pubmex.py Query.Konarska.pdf
mv Query.Konarska.pdf --> ./Smith.Konarska."Nought.may.endure.but.mutability".spliceosome.dynamics.regulation.splicing.MolCell.2008.pdf
    
$ pubmex.py eabc9191.full.pdf
mv  eabc9191.full.pdf --> ./Balas.Johnson.Establishing.RNA-RNA.interactions.remodels.lncRNA.structure.promotes.PRC2.activity.SciAdv.2021.pdf
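
The renaming depends on first finding a DOI in the text of the PDF. The sketch below shows one common way to pull a DOI out of a line of extracted text with a regular expression. The pattern and the helper name `find_doi` are assumptions for illustration, not the expression pubmex.py itself uses. The two sample lines are the `doi_line` values from the `--debug` output quoted in the Comments section.

```python
import re

# Heuristic DOI pattern: "10.", a 4-9 digit registrant code, "/", then a suffix.
# This is an assumption for the example, not the pattern used by pubmex.py.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>|]+")


def find_doi(text):
    """Return the first DOI-like string in text, or None if nothing matches."""
    match = DOI_PATTERN.search(text)
    if match is None:
        return None
    # Drop punctuation that often trails a DOI in running text.
    return match.group(0).rstrip(".,;)")


doi_1 = find_doi("DX.DOI.ORG/10.1021/CT200162X | J. CHEM. THEORY COMPUT. 2011, 7, 28862902")
doi_2 = find_doi("11641174 NUCLEIC ACIDS RESEARCH, 2020, VOL. 48, NO. 3 DOI: 10.1093/NAR/GKZ1184")
print(doi_1)
print(doi_2)
```

Anchoring the match on the `10.` prefix keeps resolver fragments such as `DX.DOI.ORG/` out of the result.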

DEPENDENCIES

INSTALLATION

pip install pubmex
# Ubuntu (Debian-based system)
apt-get install xclip python3-biopython poppler-utils # poppler-utils provides pdftotext
# macOS
brew install poppler biopython # or "sudo port install poppler biopython"
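
To confirm that the pieces above are visible to the interpreter that will run the script, a small check like the following can help. It is only a sketch: the function name `check_dependencies` is made up for this example, and it covers just `pdftotext` and Biopython. Run it with the same Python that runs pubmex.py, since a module installed for one interpreter is not necessarily importable from another.

```python
import importlib.util
import shutil


def check_dependencies():
    """Report whether pdftotext and Biopython can be found from this interpreter."""
    return {
        # pdftotext is the command-line tool shipped with poppler.
        "pdftotext": shutil.which("pdftotext") is not None,
        # Biopython is imported as the "Bio" package.
        "biopython": importlib.util.find_spec("Bio") is not None,
    }


status = check_dependencies()
for name, found in status.items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```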

HISTORY

  • 1.4 Add osx-automator
  • 1.3 Fixed #4 #5
  • 1.2 Fixed #2
  • 1.1 Simplify input, pubmex.py *.pdf
  • 1.0 With recent bugfixes 2021
  • 0.3 OSX installation
  • 0.2 Small changes
  • 0.1 Init version in 2010! :-)
Comments
  • Automator not working

    It seems that with the Automator workflows that ship with pubmex, pubmex.py cannot be found.

        for f in "$@"
        do
            pubmex.py $f
        done
    

    The following error is displayed:

    The action “Run Shell Script” encountered an error: “zsh:3: command not found: pubmex.py”

    When specifying the direct location of just the pubmex.py file, another error occurs.

        for f in "$@"
        do
            /users/suntim/miniforge3/bin/pubmex.py $f
        done
    

    The following error is displayed:

    The action “Run Shell Script” encountered an error: “”

    When specifying the direct location of both python and the pubmex.py file, another error occurs.

        for f in "$@"
        do
            /usr/local/bin/python3 /users/suntim/miniforge3/bin/pubmex.py $f
        done
    

    The following error is displayed:

    The action “Run Shell Script” encountered an error: “Traceback (most recent call last): File "/users/suntim/miniforge3/bin/pubmex.py", line 27, in <module> from Bio import Entrez ModuleNotFoundError: No module named 'Bio'”

    I have all dependencies installed (pip3 install pubmex, pip3 install biopython, brew install poppler). The readme.md says that biopython should be installed via brew; I assume that is a mistake, so I installed it via pip3 instead.

    The same error messages occur regardless of whether the zsh or the bash version is used.

    opened by LinusKaiser 2
  • Not found in PubMed, although DOI (.ORG/10.1016/J.BBAGRM.2015.08.009) was detected

    [email protected]:~/Desktop/pdfs$ pubmex.py -a -r -f 1-s2.0-S1874939915001868-main.pdf
    ERROR: Not found in PubMed, although DOI (.ORG/10.1016/J.BBAGRM.2015.08.009) was detected in the pdf!
    Traceback (most recent call last):
      File "/home/magnus/bin/pubmex.py", line 472, in <module>
        main()
      File "/home/magnus/bin/pubmex.py", line 451, in main
        title = get_title_auto_from_text(text, OPTIONS.debug, False, OPTIONS.keywords)
      File "/home/magnus/bin/pubmex.py", line 239, in get_title_auto_from_text
        return get_title_via_doi(doi, debug, reference, customed_title)
      File "/home/magnus/bin/pubmex.py", line 359, in get_title_via_doi
        pmid = get_pmid_via_doi_net(doi)
      File "/home/magnus/bin/pubmex.py", line 333, in get_pmid_via_doi_net
        return get_value('citation_pmid', content)
    TypeError: get_value() takes exactly 3 arguments (2 given)

    opened by mmagnus 2
  • Invalid git clone (edit: on windows machines)

    The colon in 'demo/10.1261:rna.418407.pdf' causes problems when cloning on Windows machines, because ':' is not allowed in Windows file names.

    Cloning into 'pubmex'...
    remote: Enumerating objects: 426, done.
    remote: Counting objects: 100% (9/9), done.
    remote: Total 426 (delta 8), reused 8 (delta 8), pack-reused 417
    Receiving objects: 100% (426/426), 3.79 MiB | 2.86 MiB/s, done.
    Resolving deltas: 100% (252/252), done.
    error: invalid path 'demo/10.1261:rna.418407.pdf'
    fatal: unable to checkout working tree
    warning: Clone succeeded, but checkout failed.
    You can inspect what was checked out with 'git status'
    and retry with 'git restore --source=HEAD :/'

    opened by gcasale 1
  • ct200162x.pdf

    (py37) [mx] rna$ pubmex.py ct200162x.pdf --debug
    filename: .......... ct200162x.pdf
    filename: .......... ct200162x.pdf
    doi: ............... ct200162x
    IdList.............. []
    pmid: .............. False
    ERROR: 		Not found in PubMed, although DOI (ct200162x) was detected in the pdf!
    generate ./temp.....[OK]
    out:
    err:
    temp is going to be opened
    doi_line: .......... DX.DOI.ORG/10.1021/CT200162X | J. CHEM. THEORY COMPUT. 2011, 7, 28862902
    doi is found: ...... 10.1021/CT200162X
    doi: ............... 10.1021/CT200162X
    IdList.............. ['21921995']
    pmid: .............. 21921995
    summary_dict........ {'Item': [], 'Id': '21921995', 'PubDate': '2011 Sep 13', 'EPubDate': '2011 Aug 2', 'Source': 'J Chem Theory Comput', 'AuthorList': ['Zgarbová M', 'Otyepka M', 'Sponer J', 'Mládek A', 'Banáš P', 'Cheatham TE 3rd', 'Jurečka P'], 'LastAuthor': 'Jurečka P', 'Title': 'Refinement of the Cornell et al. Nucleic Acids Force Field Based on Reference Quantum Chemical Calculations of Glycosidic Torsion Profiles.', 'Volume': '7', 'Issue': '9', 'Pages': '2886-2902', 'LangList': ['English'], 'NlmUniqueID': '101232704', 'ISSN': '1549-9618', 'ESSN': '1549-9626', 'PubTypeList': ['Journal Article'], 'RecordStatus': 'PubMed', 'PubStatus': 'ppublish+epublish', 'ArticleIds': {'pubmed': ['21921995'], 'medline': [], 'doi': '10.1021/ct200162x', 'pmc': 'PMC3171997', 'rid': '21921995', 'eid': '21921995', 'pmcid': 'pmc-id: PMC3171997;'}, 'DOI': '10.1021/ct200162x', 'History': {'pubmed': ['2011/09/17 06:00'], 'medline': ['2011/09/17 06:01'], 'received': '2011/03/08 00:00', 'entrez': '2011/09/17 06:00'}, 'References': [], 'HasAbstract': IntegerElement(1, attributes={}), 'PmcRefCount': IntegerElement(242, attributes={}), 'FullJournalName': 'Journal of chemical theory and computation', 'ELocationID': '', 'SO': '2011 Sep 13;7(9):2886-2902'}
    ERROR: 		Problem! The pubmex could not find automatically a title for the pdf file! Sorry!
    
    opened by mmagnus 0
  • gkz1184.pdf

    (py37) [mx] rna$ pubmex.py gkz1184.pdf --debug
    filename: .......... gkz1184.pdf
    filename: .......... gkz1184.pdf
    doi: ............... gkz1184
    IdList.............. []
    pmid: .............. False
    ERROR: 		Not found in PubMed, although DOI (gkz1184) was detected in the pdf!
    generate ./temp.....[OK]
    out:
    err:
    temp is going to be opened
    doi_line: .......... 11641174 NUCLEIC ACIDS RESEARCH, 2020, VOL. 48, NO. 3 DOI: 10.1093/NAR/GKZ1184
    doi is found: ...... 10.1093/NAR/GKZ1184
    doi: ............... 10.1093/NAR/GKZ1184
    IdList.............. ['31889193']
    pmid: .............. 31889193
    summary_dict........ {'Item': [], 'Id': '31889193', 'PubDate': '2020 Feb 20', 'EPubDate': '', 'Source': 'Nucleic Acids Res', 'AuthorList': ['Reißer S', 'Zucchelli S', 'Gustincich S', 'Bussi G'], 'LastAuthor': 'Bussi G', 'Title': 'Conformational ensembles of an RNA hairpin using molecular dynamics and sparse NMR data.', 'Volume': '48', 'Issue': '3', 'Pages': '1164-1174', 'LangList': ['English'], 'NlmUniqueID': '0411011', 'ISSN': '0305-1048', 'ESSN': '1362-4962', 'PubTypeList': ['Journal Article'], 'RecordStatus': 'PubMed - indexed for MEDLINE', 'PubStatus': 'ppublish', 'ArticleIds': {'pubmed': ['31889193'], 'medline': [], 'pii': '5691221', 'doi': '10.1093/nar/gkz1184', 'pmc': 'PMC7026608', 'rid': '31889193', 'eid': '31889193', 'pmcid': 'pmc-id: PMC7026608;'}, 'DOI': '10.1093/nar/gkz1184', 'History': {'pubmed': ['2020/01/01 06:00'], 'medline': ['2020/03/20 06:00'], 'accepted': '2019/12/09 00:00', 'revised': '2019/12/05 00:00', 'received': '2019/10/14 00:00', 'entrez': '2020/01/01 06:00'}, 'References': [], 'HasAbstract': IntegerElement(1, attributes={}), 'PmcRefCount': IntegerElement(3, attributes={}), 'FullJournalName': 'Nucleic acids research', 'ELocationID': 'doi: 10.1093/nar/gkz1184', 'SO': '2020 Feb 20;48(3):1164-1174'}
    ERROR: 		Problem! The pubmex could not find automatically a title for the pdf file! Sorry!
    
    opened by mmagnus 0
  • Some problem after I removed some prints to make the script quiet

    (py37) [mx] d$ pubmex -p 10.1016/j.molcel.2020.11.004
    (py37) [mx] d$ pubmex -p 10.1016/j.molcel.2020.11.004 -d
    doi: ............... 10.1016/j.molcel.2020.11.004
    IdList.............. ['33259809']
    pmid: .............. 33259809
    summary_dict........ {'Item': [], 'Id': '33259809', 'PubDate': '2020 Dec 17', 'EPubDate': '2020 Nov 5', 'Source': 'Mol Cell', 'AuthorList': ['Ziv O', 'Price J', 'Shalamova L', 'Kamenova T', 'Goodfellow I', 'Weber F', 'Miska EA'], 'LastAuthor': 'Miska EA', 'Title': 'The Short- and Long-Range RNA-RNA Interactome of SARS-CoV-2.', 'Volume': '80', 'Issue': '6', 'Pages': '1067-1077.e5', 'LangList': ['English'], 'NlmUniqueID': '9802571', 'ISSN': '1097-2765', 'ESSN': '1097-4164', 'PubTypeList': ['Journal Article'], 'RecordStatus': 'PubMed - indexed for MEDLINE', 'PubStatus': 'ppublish+epublish', 'ArticleIds': {'pubmed': ['33259809'], 'medline': [], 'pii': 'S1097-2765(20)30782-6', 'doi': '10.1016/j.molcel.2020.11.004', 'pmc': 'PMC7643667', 'rid': '33259809', 'eid': '33259809', 'pmcid': 'pmc-id: PMC7643667;'}, 'DOI': '10.1016/j.molcel.2020.11.004', 'History': {'pubmed': ['2020/12/02 06:00'], 'medline': ['2021/01/12 06:00'], 'received': '2020/07/20 00:00', 'revised': '2020/10/05 00:00', 'accepted': '2020/10/29 00:00', 'entrez': '2020/12/01 20:08'}, 'References': [], 'HasAbstract': IntegerElement(1, attributes={}), 'PmcRefCount': IntegerElement(10, attributes={}), 'FullJournalName': 'Molecular cell', 'ELocationID': 'doi: 10.1016/j.molcel.2020.11.004', 'SO': '2020 Dec 17;80(6):1067-1077.e5'}
    Ziv.Miska.The.Short-Long-Range.RNA-RNA.Interactome.SARS-CoV-2.MolCell.2020.pdf
    
    bug 
    opened by mmagnus 0
Releases(1.4.2)
  • 1.4.2(Mar 15, 2022)

    Finder now offers a pubmex Quick Action, so you can quickly run it on a number of PDF files.

    Install pubmex_zsh.workflow from pubmex/osx-automator/ if your default shell is zsh, or pubmex_bash.workflow if it is bash.

  • 1.4.1(Mar 12, 2022)

  • 1.4(Sep 27, 2021)

    Finder now offers a pubmex Quick Action, so you can quickly run it on a number of PDF files.

    Install pubmex_zsh.workflow from pubmex/osx-automator/ if your default shell is zsh, or pubmex_bash.workflow if it is bash.

  • 1.3(Sep 26, 2021)

  • 1.2(Sep 14, 2021)

  • 1.1(Aug 18, 2021)

    Simplify input to pubmex.py *.pdf. Fixed #2

    Now, usage:

    $ pubmex.py sharp2017.pdf
    mv  sharp2017.pdf --> ./Sharp.Hockfield.Convergence.The.future.health.Science.2017.pdf
    
    $ pubmex.py  Query.Konarska.pdf
    mv  Query.Konarska.pdf --> Smith.Konarska."Nought.may.endure.but.mutability".spliceosome.dynamics.regulation.splicing.MolCell.2008.pdf
    
    $ pubmex.py eabc9191.full.pdf
    mv  eabc9191.full.pdf --> ./Balas.Johnson.Establishing.RNA-RNA.interactions.remodels.lncRNA.structure.promotes.PRC2.activity.SciAdv.2021.pdf
    
  • 1.0(Jun 23, 2021)

    I don’t want to put every PDF file I collect along the way into my library, because the library then gets very big (and then it’s hard to sync, for example with Dropbox). So now I keep these PDF files in a pdf-icebox and rename them nicely and automatically:

    Usage:

    $ pubmex.py -a -f sharp2017.pdf -r
    mv  sharp2017.pdf --> ./Sharp.Hockfield.Convergence.The.future.health.Science.2017.pdf
    
    $ pubmex.py -a -f Query.Konarska.pdf -r
    mv  Query.Konarska.pdf --> Smith.Konarska."Nought.may.endure.but.mutability".spliceosome.dynamics.regulation.splicing.MolCell.2008.pdf
    
    $ pubmex.py -a -f eabc9191.full.pdf -r
    mv  eabc9191.full.pdf --> ./Balas.Johnson.Establishing.RNA-RNA.interactions.remodels.lncRNA.structure.promotes.PRC2.activity.SciAdv.2021.pdf
    

    .. and we get a file:

    Smith.Konarska."Nought.may.endure.but.mutability".spliceosome.dynamics.regulation.splicing.MolCell.2008.pdf

Owner
Marcin Magnus
Ph.D., molecular biologist & bioinformatician, uses Pen & Paper and Emacs for notes, coding & RNA!