Internet and Open-Access Publishing in Physics Research
Publication of research in most areas of physics has changed dramatically in the past decade, with nearly all research now being published on the Internet. To appreciate how this has happened, why it is here to stay, and how it is likely to spread to other areas, it is necessary to understand that publication in physics is essentially done via papers, usually rather narrowly focused short papers on a single topic. The few books that physicists have written are mainly pedagogical. In my field, theoretical physics, papers typically have one to four authors. In most areas of physics the order of authors on a paper is always alphabetical. The changes in how research is published, which I will describe below, have in turn significantly modified how research is done. Most research is also still published in journals, but their purpose is now largely archival: like most physicists, I no longer subscribe to or read journals.
Every day anyone anywhere who finishes a paper posts it on the Internet, at www.arxiv.org. The next day anyone anywhere with Internet access can visit that site, read the title and author(s) of all the papers posted that day, click and read the abstracts if they wish, and then click and bring up any paper on their screen, click and print it. The arXiv (as it is named) was started by Paul Ginsparg in 1991 for theoretical particle physics, and has now expanded to most areas of physics as well as such theoretical fields of science as mathematics, quantitative biology, and so on. To help keep the system always accessible and responsive, and fast even for large information transfers, there are currently seventeen mirror sites worldwide, including three in the United States, five in Europe, and four in Asia. Currently the arXiv is supported mainly by Cornell University and the National Science Foundation. The costs are small, on the order of 2 percent of that of the main U.S. physics journal, Physical Review. The arXiv manifesto is “ArXiv is an openly accessible, moderated repository for scholarly papers in specific scientific disciplines. Material submitted to arXiv is expected to be of interest, relevance, and value to those disciplines. ArXiv was developed to be, and remains, a means for specific communities to exchange information” (www.arXiv.org). Note that the criteria do not directly include some that one might expect to find on the list, such as “correct.” Originally anyone could post papers, with essentially no content control or peer review. That has evolved to mild control—basically once one has posted something, one can then post anything in that and related areas. First-time authors need “endorsement” from someone who has posted something.
An unintended consequence of the existence of the arXiv with daily posting is that it has hugely accelerated the rate of research, and subtly shaped the form papers take. Research has shifted toward being a dialogue, or better, multilogue. Communication has always been very important for research in theoretical physics. In the past one might work on a topic for some months without much interaction with others. Now as one is working, relevant papers are appearing, so one integrates their results, and work moves rapidly.
Journal publication is still used for archival purposes, and for evaluations by committees, chairs, deans, and so on. The posted arXiv papers are not peer reviewed. If an active researcher cannot tell whether something is valid, it is his or her problem. It is pretty clear to experts what work is relevant. There are strong inhibitions against posting low-quality or wrong work because of the resulting damage to one's reputation. For two reasons this system is probably relatively easy to implement in theoretical physics compared to other areas, such as biology. First, in physics results are normally right or wrong, relevant or irrelevant, and it is not very hard to tell which. Second, most people who have been in the field for a while are acquainted with or at least aware of nearly all the others in the field, and with their work and biases and how likely they are to be correct.
In theoretical physics this open-access Internet publishing is an unqualified success. Will it spread to all areas of science and even more broadly? At the institutional level there is movement toward making this happen. As one example, the Abdus Salam International Centre for Theoretical Physics (based in Trieste) has very recently organized an open-access archive that allows the scientific work of any scientist from any country to be posted free of charge. Authors may upload preprints, reprints, conference papers, prepublication book chapters, and so on. Acceptable subjects include science areas such as physics, mathematics, biology, earth sciences, computer sciences; technology areas such as computer software and networking, environmental technology; education areas; science policy areas; and more.
CERN, the European particle physics center, in December 2005, hosted an international meeting, attended by about eighty representatives of major publishers, learned societies, funding agencies, and authors from Europe and the United States. Its goal was to promote open-access publishing. In March 2007, a task force recommended establishing a sponsoring consortium for open-access publishing in particle physics (SCOAP3), in which a “global network of funding agencies, research laboratories, and libraries will contribute the necessary funding” (“Proposal”). Contributors will recover their payments by cancelling paper subscriptions; payments will be based on the number of scientific publications from a country or laboratory over a specified time period. It seems rather clear that in essentially all areas of quantitative theoretical science open-access publishing will be increasingly important. The American Association for the Advancement of Science (publisher of Science) has recently done a study on open access, available at www.alpsp.org (though it focuses on open-access journal publishing rather than independent posting such as the arXiv).
Moving to open-access publication will be more difficult in biology for several reasons. Evaluating the validity of reported results is considerably more difficult in biological areas and particularly in biomedical ones, where many more variables and considerations can affect the outcome of experiments and analyses. Science in these areas is less theoretical than in physics. There are far more practitioners, so it is much less likely that the people and their reputations are known to nearly everyone. It is harder to tell who actually did the work. The top journals (e.g., Science and Nature) currently refuse to publish a paper if it is first posted on the arXiv. Coming to terms with these issues, and finding a productive level of open-access publishing for areas other than theoretical science, will receive increasing attention in the near future.
The arXiv (and presumably open-access publishing in general) will keep evolving. Recently the arXiv added a new feature whose value and use level are not yet known. A qualified physicist with a blog can write a comment about a particular paper. Using a new protocol called TrackBack, the blogger's website notifies the arXiv, which then provides a link to the blog next to the abstract of the paper. Anyone who looks at a paper can then click and read what others have written about it. Only those qualified to post on arXiv can comment, and TrackBacks from anonymous sites are not allowed.
Finally, I will comment briefly on two of the themes of this anthology, plagiarism and scientific fraud. They provide further perspective on why open-access publishing has been and will be easier to implement in theoretical science than in other areas. Basically, plagiarism of writing and fraud are not important issues in theoretical science, whatever one might read from experts in these areas or in the media. First, the fraction of workers who might do these things is probably smaller than in other areas, partly because workers mostly are trained by example not to do it, and more importantly, because they are aware that they are highly likely to be caught. The results of science can be trusted, with high probability in the short term, and with very high probability in the longer term. That is not because every scientist is honest (not all are) but because if a paper or a result is interesting then knowledgeable people will quickly see it, read it, and try to reproduce the result. Copying and fraud will be spotted, and not ignored. Reproducing results can take longer if detailed calculations or lab measurements are involved, but the checks will be done. These mechanisms have operated effectively in all the well-publicized cases, with scientists catching the fraud about as quickly as possible, given the time needed for checking the results, despite current hype from the media and “ethics experts.” The integrity of science is functioning just as it should and protecting the public as well as is possible. It is extremely difficult to fool scientists into thinking a false result is true (and, of course, the results of science are compared to a real world, so truth is not socially constructed).
Plagiarism of ideas is a somewhat larger problem, but not a significant one. The period from having an idea to showing that the idea is not inconsistent with existing data and theory, and then to figuring out feasible tests of it, can take weeks to months, and this work can only be done by qualified scientists. Theoretical science is a communication-intensive area, so scientists mostly know what everyone in the world in their area is doing, and who has what ideas. Top research universities and labs have one to two seminars a week in each research area (theoretical particle physics, astrophysics, etc.), mostly from outside visitors, usually about recent or unfinished work. ArXiv posting settles literal priority (journal publication dates are no longer relevant). Plagiarism of ideas may occur, but is unlikely to go undetected; the subsequent damage to the reputation of those doing it acts as a deterrent.
Theoretical physicists and theoretical scientists in general are very happy with the arXiv and with open-access publishing. There seems to be a nearly ideal match with how research should be done in these areas. Some modifications will be needed for open-access publishing to spread to other areas of science, and beyond science, but I am confident that this will happen.
“The arXiv Endorsement System.” www.arXiv.org/help/endorsement. Consulted May 8, 2007.
“Proposal to Establish a Sponsoring Consortium for Open Access Publishing in Particle Physics.” http://doc.cern.ch/archive/electronic/cern/preprints/open/open2007-009.pdf. Consulted July 26, 2007.