Contact us
Paperscape is being developed by Damien George (homepage) and Rob Knegjens (homepage). You can contact us by leaving a comment on this blog, sending us an email or by visiting the #paperscape IRC channel on freenode.
41 thoughts on “Contact us”
Your map is very impressive! Congratulations!
I was wondering if you could use your data to test the six degrees of separation theory, i.e., that everyone and everything is six or fewer steps away. In some sense, if paper A cites paper B, and paper B cites paper C, then paper A and C would be two degrees apart from each other. And all you would need to connect any two papers on your map would be a maximum of six steps from each other.
I am just curious if this could be computed somehow with Paperscape.
Thank you Rodrigo!
Yes, it would be quite easy to use our database to compute the “degrees of separation” or “minimum path length” between any two papers. If we did this for all pairs of papers and worked out the maximum length then we could test the six-degrees hypothesis.
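For readers who want to try this on their own data, here is a minimal sketch of the minimum path length computation using a breadth-first search. It is not Paperscape's code: the function name, the `(citing, cited)` pair format and the toy data are assumptions for illustration, and citations are treated as undirected links so that a path can run in either direction.

```python
from collections import deque


def degrees_of_separation(links, start, goal):
    """Return the minimum number of citation steps between two papers, or None."""
    # Build an undirected adjacency map: a citation connects both papers.
    neighbours = {}
    for citing, cited in links:
        neighbours.setdefault(citing, set()).add(cited)
        neighbours.setdefault(cited, set()).add(citing)

    if start == goal:
        return 0

    # Breadth-first search visits papers in order of increasing distance,
    # so the first time we reach the goal we have the shortest path.
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        paper, dist = queue.popleft()
        for other in neighbours.get(paper, ()):
            if other == goal:
                return dist + 1
            if other not in seen:
                seen.add(other)
                queue.append((other, dist + 1))
    return None  # the two papers are not connected at all


# Hypothetical toy data: A cites B, B cites C, and D cites E.
links = [("A", "B"), ("B", "C"), ("D", "E")]
print(degrees_of_separation(links, "A", "C"))  # 2
print(degrees_of_separation(links, "A", "E"))  # None
```

Testing the six-degrees hypothesis would mean running this from every paper and taking the maximum distance found, which is far more expensive on a graph of around a million nodes than this toy example suggests.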
I am working on large databases as well, and I liked very much the visualization of data you chose.
I’d like to ask you which framework / js library you use to render the clusters, and handle the zoom in/out with almost 1M nodes.
Could you please share some hints?
We’ve had several people inquire about our technical setup so we plan to post more details in the near future in the form of a blog post or on the About page.
To quickly answer your question though, the map is made up of pregenerated bitmap tiles that we regenerate on the server each morning after we update the map with new papers. We display these tiles using an HTML5 canvas element. The dynamic overlays and underlays you see when you interact with the graph or search are also drawn using canvas elements. Aside from jQuery and requirejs we don’t use any other js libraries or frameworks. If you have any other specific questions please feel free to ask.
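To illustrate the general idea of a tiled map, here is a small sketch of how a viewer can work out which pregenerated tiles overlap the current viewport. It is written in Python for brevity rather than in the JavaScript the site uses, and everything in it is an assumption: the 256-pixel tile size, the `(column, row)` indexing and the function name are hypothetical and not taken from Paperscape.

```python
import math

TILE_SIZE = 256  # assumed tile edge length in pixels


def visible_tiles(view_x, view_y, view_w, view_h, tile_size=TILE_SIZE):
    """Return (column, row) indices of the tiles overlapping a viewport.

    The viewport is given in map pixel coordinates at the current zoom level.
    """
    first_col = math.floor(view_x / tile_size)
    first_row = math.floor(view_y / tile_size)
    # The last visible pixel is at offset (size - 1) from the viewport origin.
    last_col = math.floor((view_x + view_w - 1) / tile_size)
    last_row = math.floor((view_y + view_h - 1) / tile_size)
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]


# A 100x100 viewport starting at x=200 straddles the first two tile columns.
print(visible_tiles(200, 0, 100, 100))  # [(0, 0), (1, 0)]
```

Only the tiles returned for the current viewport need to be fetched and drawn, which is what keeps a map with a very large number of nodes responsive.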
Hi guys! You have done a very nice job! We are looking to make something like a map of publications by employees of our university, and your solution looks quite sufficient to fulfill this task. Can you share the technology?
Thanks! We’re planning to post the details of the technologies we used to create Paperscape soon. This will include details about how we parse papers, create the graph, serve information from our server, and build the paperscape program that you interact with in the browser.
What an astounding image, a truly overwhelming and scientifically important landscape. I would love to have this printed onto a large canvas for the office wall, is it possible to get a full size file that could be printed?
That’s a really good idea, to have a hi-res version for printing. It would be relatively easy for us to generate. Approximately what resolution were you thinking of?
I’m not 100% sure; the larger the better, as you could always scale down without a loss in resolution? As always, your thoughts are most welcome!
Sorry for the delay but we’ve finally got around to making posters of the paperscape map. See this blog post: http://blog.paperscape.org/?p=182
Please let us know if this is what you had in mind.
As large as possible; you can always scale down as required without losing resolution. Would this be possible?
This is an excellent visualization: well done! I used it as a basis for speculation about future Nobel Prize winners in this blog post:
Wow, this is a very beautiful project, congratulations. I have noticed some relatively large ‘islands’ with very distinct structure (there is a large one to the right of quantum physics) and I was wondering if you know why such structures appear.
The structures you are referring to (with lots of very small papers all the same size) are not very realistic. They are made up of papers for which we don’t have any reference information, so we cannot make links to other papers. Without reference information, we just put papers close to others in the same category. Such papers tend to clump in regions of low “anti-gravitational” potential, and make these island structures.
If we had reference information for these papers, then they would be placed in better locations on the map.
Hi guys. I’ve just discovered Paperscape recently, and read about your project. First of all I’d like to congratulate you on the idea and the implementation — it’s a great map of the arXiv.
I’m interested in the citations-count engine. If I understand correctly, you count the citations found in the arXiv articles themselves, and the circle area on the map is proportional to that count. I’d like to ask if you have a (publicly available) database of that citation count?
Regards and the best of luck with the project,
Yes, we count citations using some custom code. The raw data is available here: github.com/paperscape/paperscape-data
If you use it for anything interesting please let us know!
Dear Damien and Rob,
I just discovered recently PaperScape and I am very impressed with what it does. Congratulations for your work.
I would like to know whether PaperScape could be adapted, more or less easily, to analyse another set of documents. Each document describes a research project proposal. Each document would be represented, for instance, by some free keywords, an abstract provided as an XML or CSV file, and possibly a narrative part of the project description provided as a PDF file.
Thanks in advance for your comments.
Apologies for the very late reply. Nice to hear you like Paperscape.
We have experimented with applying the Paperscape algorithms to other data sets. If the data is converted to the same graph format as our arXiv data, it is not too difficult to generate a map; however, this map doesn’t automatically look good. In the case of the arXiv map it took quite some tuning of force parameters and link weights to achieve the current result. A key link weighting, for example, is the reference frequency: the number of times a given reference is cited within a paper.
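As a rough illustration of the reference frequency, the sketch below counts how often each reference is cited within a single paper's text. The input format (a list of reference keys in the order they are cited), the function names and the normalisation into weights are all assumptions for this example; the reply above does not say how the counts are turned into link weights.

```python
from collections import Counter


def reference_frequencies(in_text_citations):
    """Count how many times each reference is cited within one paper."""
    return Counter(in_text_citations)


def link_weights(in_text_citations):
    """Scale the counts so they sum to 1 (this normalisation is an assumption)."""
    counts = reference_frequencies(in_text_citations)
    total = sum(counts.values())
    return {ref: n / total for ref, n in counts.items()}


# Hypothetical paper that cites ref-1 three times and two others once each.
cited = ["ref-1", "ref-2", "ref-1", "ref-3", "ref-1"]
print(reference_frequencies(cited)["ref-1"])  # 3
print(link_weights(cited)["ref-1"])  # 0.6
```

A reference cited many times in the text then pulls its paper closer than one mentioned only in passing.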
Hope that answers your question
Paperscape is a very beautiful piece of work. I would like to know about the gravity algorithm behind the graphics, since no reference is provided for it.
The idea you proposed is very attractive to me and I really want to learn from it. Mainly I want to understand the gravity algorithm, which is not covered; as for making the map look beautiful, I can tune that myself. Where can I find information about it?
The gravity algorithm is that used by N-body galaxy simulation code in astrophysics and cosmology research. We use the Barnes-Hut algorithm with a quad-tree: https://en.wikipedia.org/wiki/Barnes%E2%80%93Hut_simulation
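For anyone unfamiliar with the technique, here is a compact sketch of a Barnes-Hut force calculation on a quad-tree. It is a generic textbook-style version, not the Paperscape implementation: the class and method names, the attractive inverse-square force with unit gravitational constant, the opening angle `theta` and the softening length `eps` are all assumptions for illustration.

```python
import math

MIN_HALF = 1e-9  # stop subdividing below this size (guards against coincident points)


class Cell:
    """A square quad-tree cell storing total mass and mass-weighted position."""

    def __init__(self, cx, cy, half):
        self.cx, self.cy, self.half = cx, cy, half  # cell centre and half-width
        self.mass = 0.0
        self.mx = 0.0  # sum of mass * x over bodies in this cell
        self.my = 0.0  # sum of mass * y over bodies in this cell
        self.children = None  # four sub-cells once the cell has been split

    def _child_for(self, x, y):
        index = (2 if x >= self.cx else 0) + (1 if y >= self.cy else 0)
        return self.children[index]

    def insert(self, x, y, m):
        if self.children is None and self.mass > 0.0 and self.half > MIN_HALF:
            # Occupied leaf: split it and push the existing body down a level.
            h = self.half / 2
            self.children = [Cell(self.cx + sx * h, self.cy + sy * h, h)
                             for sx in (-1, 1) for sy in (-1, 1)]
            ox, oy = self.mx / self.mass, self.my / self.mass
            self._child_for(ox, oy).insert(ox, oy, self.mass)
        if self.children is not None:
            self._child_for(x, y).insert(x, y, m)
        self.mass += m
        self.mx += m * x
        self.my += m * y

    def force_at(self, x, y, theta=0.5, eps=1e-3):
        """Approximate force per unit mass at (x, y) from the bodies in this cell."""
        if self.mass == 0.0:
            return 0.0, 0.0
        dx = self.mx / self.mass - x
        dy = self.my / self.mass - y
        dist = math.sqrt(dx * dx + dy * dy + eps * eps)  # softened distance
        # A leaf, or a cell that looks small from here, acts as one point mass.
        if self.children is None or (2 * self.half) / dist < theta:
            f = self.mass / dist ** 3
            return f * dx, f * dy
        fx = fy = 0.0
        for child in self.children:
            cfx, cfy = child.force_at(x, y, theta, eps)
            fx += cfx
            fy += cfy
        return fx, fy


# Hypothetical bodies as (x, y, mass), inside a root cell spanning [-4, 4] in x and y.
bodies = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (0.0, 2.0, 1.0), (-3.0, -1.0, 4.0)]
root = Cell(0.0, 0.0, 4.0)
for x, y, m in bodies:
    root.insert(x, y, m)
print(root.force_at(0.0, 0.0))
```

Larger values of `theta` approximate more distant groups as single masses, trading accuracy for speed; `theta=0` falls back to an exact sum over all bodies. The softening term also means a body exerts no force on itself when the query point coincides with it.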
I have also tried using the Barnes-Hut algorithm with a quad-tree, but I cannot avoid overlapping points, and the degree of clustering is not ideal. Could you describe your approach in more detail? I would like to learn how you use it, purely to improve my own understanding; if possible, could you point me to the source code? My email: email@example.com. Thank you in advance.
I sent you an e-mail; I do not know if you saw it?
Hi, very impressive graphics! I was wondering: in my.paperscape, individual papers are connected by different numbers of lines with different colors when clicked. What do the number of lines and the colors mean?
Thanks Sisi! In my.paperscape, to keep the graphics neat and tidy the citation links are folded together. The number (or thickness) of lines between papers represents a reference/citation in the entire chain of connected papers. For example, if you have 3 papers, A, B and C (C the newest, A the oldest), and if C cites A and B, and B cites A, then there will be two lines connecting C to B (one representing the citation to B itself, the other going through B to A), and two lines connecting B to A (one representing the citation of B to A, the other the citation of C to A).
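The three-paper example can be reproduced with a short sketch. It assumes the simplest possible case, where the connected papers form a single chain ordered from oldest to newest and every citation is drawn along that chain, adding one line to each segment it passes through. The function name and data format are hypothetical, and the real my.paperscape layout presumably handles more general graphs than this.

```python
from collections import Counter


def folded_line_counts(chain, citations):
    """Count the lines drawn on each segment of a chain of papers.

    chain: papers ordered from oldest to newest.
    citations: (citing, cited) pairs between papers in the chain.
    """
    position = {paper: i for i, paper in enumerate(chain)}
    lines = Counter()
    for citing, cited in citations:
        lo, hi = sorted((position[citing], position[cited]))
        # The citation is routed through every paper between its two ends.
        for i in range(lo, hi):
            lines[(chain[i + 1], chain[i])] += 1  # keyed as (newer, older)
    return lines


# A is the oldest paper and C the newest; C cites A and B, and B cites A.
chain = ["A", "B", "C"]
citations = [("C", "A"), ("C", "B"), ("B", "A")]
counts = folded_line_counts(chain, citations)
print(counts[("C", "B")], counts[("B", "A")])  # 2 2
```

The citation from C to A contributes one line to each of the two segments, which is why both pairs end up joined by two lines.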
When you click on a paper, the red lines represent the actual refs/cites of that paper, and the grey lines are the folded refs/cites of other papers.
Great! Looking forward to reading your scientific paper 🙂
I love your site. I do have a “feature-request”: would it be possible to allow for an author search to be available from the address-bar, much like inspire allows (in a form similar to “http://inspirehep.net/search?p=AUTHORNAME&action_search=Search”)?
Thanks for your interest in Paperscape.
You can add any search query to the address-bar by appending /?s=your_query to the paperscape.org url.
For example you could search for your name with:
or for supersymmetry:
For some characters you need to use url encoding (‘?’ -> %3F, spaces -> ‘+’ etc.). For example “?author Birnholtz” becomes:
Hope that helps
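If you want to build such links programmatically, the encoding rule described above can be sketched with Python's standard library. The `/?s=` parameter and the two encoding rules come from the reply; the `http://` scheme, the exact base URL and the function name are assumptions for this example.

```python
from urllib.parse import quote_plus

BASE_URL = "http://paperscape.org/"  # assumed scheme and host


def search_url(query):
    """Build a search link by appending ?s=<encoded query> to the base URL."""
    # quote_plus turns spaces into '+' and percent-encodes characters like '?'.
    return BASE_URL + "?s=" + quote_plus(query)


print(search_url("supersymmetry"))      # http://paperscape.org/?s=supersymmetry
print(search_url("?author Birnholtz"))  # http://paperscape.org/?s=%3Fauthor+Birnholtz
```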
An extra feature would be useful: search by affiliation. It would let us see the topics studied in the various groups.
Yes, this would be a good feature to have, and we have thought about implementing it. Unfortunately it’s not so easy because “affiliation disambiguation” is a hard problem to solve. We have the arXiv meta data for each paper but that doesn’t include affiliation (it does include author and title). So we would need to extract the info from the LaTeX source, and there are many ways to specify the affiliation in the source, each of which needs a special case to make the affiliation canonical.
Loving this tool, great job! How hard would it be to render the same data in 3D and make navigable as in this open source project:
Might not be useful per se but would add an extra level of awesome.
We did try generating the map in 3d but it was very difficult to navigate and make sense of. Also much harder to render in the browser because there are more than 1 million nodes.
Came across paperscape recently, really impressive tool! I’m interested in generating a similar map for my PhD project references, which are managed by Mendeley and output to a .bib file. The main difference is that most of the papers are not stored in the arXiv. Is it possible to generate a paperscape map for such a list of references? Presumably generating the graph is easy enough as I can put this into a similar format as on github, but what about the citation-finding code…?
We don’t include papers outside of arXiv, and don’t include support or code for extracting or linking to external references outside our database.
You could generate a map for just the references in your project, and actually the map-generation code works very well for a small number of papers (eg 100-1000) because the nodes don’t get “stuck” so much and you can tune the parameters easily to get a nice looking graph.
But in terms of generating the links (references) for the nodes in your graph, you’ll need to find a way to do that yourself. You could do it by hand if the number of papers is small. Otherwise you’d need to write some code to do it automatically.
Thanks for the info. I might try a small sub-set of papers with links generated by hand. If that looks promising, I can look into writing something to generate the links automatically.
Hi, we are writing to you from the Hackathon Zurich! We are trying to use the paperscape code and wondered how you convert the arxiv data given by the CSV files into a mysql database or other valid form? Do you provide code for that?
Eagerly looking forward to your reply, best, Viktor
The CSV data files are a dump of our local SQL database. Sorry, but we haven’t yet got around to publishing the code that generates the database and puts the entries into it.
is there a plan to do this in near future?
No, there are no plans to do this.
I saw your blog in an article on astrobites. I looked for a link to follow the blog via email, but I can’t find anything like the email sign-up I see on WordPress blogs. Can you help?
There isn’t a way to follow the blog via email. While the Paperscape map continues to update each day, there is currently very little in the way of development of the code/features, so you don’t miss much 🙂
Awesome work! It would be very interesting to try to produce a “question generator” that generates new questions in holes, where a hole lies between two papers that are close under the metric but far apart via citation path. Closing the gap of knowledge, in a sense.