Zotero integration

The Zotero plugin for Chrome, Firefox, and other browsers uses the Zotero Web API v3 to talk to www.zotero.org and maintain a cloud-based bibliographic database. It also stores the bibliographic information in a local SQLite database, the user's local library. The local library can be accessed via the Zotero JavaScript API.

To make it easy to insert citations and bibliographies into documents there are also plugins for Microsoft Word and OpenOffice/LibreOffice.

These plugins define a set of subroutines or macros to perform a set of operations. In the case of OpenOffice/LibreOffice these are (according to the file Zotero.xba):

  • ZoteroAddBibliography()
  • ZoteroAddCitation()
  • ZoteroEditCitation()
  • ZoteroEditBibliography()
  • ZoteroRefresh()
  • ZoteroRemoveCodes()
  • ZoteroSetDocPrefs()

Each calls code of the form (in this case for ZoteroAddBibliography()):

	createUnoService("org.zotero.integration.ooo.ZoteroOpenOfficeIntegration").trigger("addBibliography")

Similarly, a set of add-ons is defined, each with an icon and a URL of the form:

service:org.zotero.integration.ooo.ZoteroOpenOfficeIntegration?addCitation

In the case of the Windows integration the following Visual Basic macros are defined:

  • PROJECT.ZOTERO.ZOTEROREFRESH
  • PROJECT.ZOTERO.ZOTEROREMOVECODES
  • PROJECT.ZOTERO.ZOTEROSETDOCPREFS
  • PROJECT.ZOTERO.ZOTEROEDITCITATION
  • PROJECT.ZOTERO.ZOTEROEDITBIBLIOGRAPHY
  • PROJECT.ZOTERO.ZOTEROINSERTBIBLIOGRAPHY
  • PROJECT.ZOTERO.ZOTEROINSERTCITATION
  • PROJECT.ZOTERO.ZOTEROADDEDITCITATION

The subroutines or macros invoke code that passes information to and from a word-processor-specific integration plugin installed in the browser. In the case of OpenOffice/LibreOffice this plugin (zoteroOpenOfficeIntegration.js) opens a TCP socket on port 23116 and listens for requests from the subroutines/macros. The plugin gets or stores the information via the Zotero plugin in the browser. (Note that it also opens TCP port 19876, but this simply returns a message that you are using an outdated plugin.)

Using Wireshark to sniff the traffic for a simple document (wireshark-capture-20160709.pcapng), we can easily see the communication between the word processor plugin and the integration plugin running in the browser. The document simple_example-20160709.odt was made by first inserting a single citation for RFC 1235 (and, as a side effect, selecting the bibliographic style for this file), then inserting a bibliography, then inserting a multiple citation (for RFCs 2792, 3554, and 2704). The final document looks like:

[1], [2–4]


[1]	J. Ioannidis and G. Maguire, ‘Coherent File Distribution Protocol’, Internet Request for Comments, vol. RFC 1235 (Experimental), Jun. 1991 [Online]. Available: http://www.rfc-editor.org/rfc/rfc1235.txt
[2]	M. Blaze, J. Ioannidis, and A. Keromytis, ‘DSA and RSA Key and Signature Encoding for the KeyNote Trust Management System’, Internet Request for Comments, vol. RFC 2792 (Informational), Mar. 2000 [Online]. Available: http://www.rfc-editor.org/rfc/rfc2792.txt
[3]	S. Bellovin, J. Ioannidis, A. Keromytis, and R. Stewart, ‘On the Use of Stream Control Transmission Protocol (SCTP) with IPsec’, Internet Request for Comments, vol. RFC 3554 (Proposed Standard), Jul. 2003 [Online]. Available: http://www.rfc-editor.org/rfc/rfc3554.txt
[4]	M. Blaze, J. Feigenbaum, J. Ioannidis, and A. Keromytis, ‘The KeyNote Trust-Management System Version 2’, Internet Request for Comments, vol. RFC 2704 (Informational), Sep. 1999 [Online]. Available: http://www.rfc-editor.org/rfc/rfc2704.txt

After opening the TCP connection, the word processor (hereafter client) sends a transaction to the integration plugin (hereafter server) with the command "addCitation". This transaction consists of two 32-bit integers (in this case 0x00000000 and 0x0000000d) and the string "addCitation" (including the double quotation marks), for a total of 21 bytes. Additionally, both the client and the server set the TCP PUSH flag when sending a transaction. The server replies with an ACK, then sends a response of 45 bytes containing two 32-bit integers (in this case 0x00000001 and 0x00000025) followed by the string ["Application_getActiveDocument",[3]] . The client replies with an ACK. The client next sends 13 bytes consisting of two 32-bit integers (in this case 0x00000001 and 0x00000005) and the string [3,1] . At this point we can note two patterns: the first 32-bit integer seems to be a transaction ID, which the reply copies from the request it answers, and the second 32-bit integer is the length in bytes of the string that follows.
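The framing described above can be sketched in Python. This is only a sketch under stated assumptions: the capture shows the two integers as hex values, and the code assumes they are unsigned and big-endian (network byte order) and that the string is UTF-8. The names `encode_frame` and `decode_frame` are made up for illustration and do not come from the Zotero sources.

```python
import struct

# Assumption: two unsigned 32-bit big-endian integers (transaction ID, length).
HEADER = struct.Struct(">II")


def encode_frame(transaction_id: int, payload: str) -> bytes:
    """Build one transaction: ID, payload length in bytes, then the payload."""
    body = payload.encode("utf-8")
    return HEADER.pack(transaction_id, len(body)) + body


def decode_frame(data: bytes) -> tuple[int, str, bytes]:
    """Split one transaction off the front of data.

    Returns the transaction ID, the payload, and any bytes left over.
    """
    transaction_id, length = HEADER.unpack_from(data)
    end = HEADER.size + length
    if len(data) < end:
        raise ValueError("incomplete frame")
    return transaction_id, data[HEADER.size:end].decode("utf-8"), data[end:]


# The first client transaction in the capture: ID 0, length 0x0d.
frame = encode_frame(0, '"addCitation"')
print(len(frame), frame[:8].hex())
```

The total size of a transaction is always 8 bytes of header plus the length field, which is how the byte counts quoted in this walkthrough can be checked against the second integer.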

The server replies with 40 bytes consisting of two 32-bit integers (in this case 0x00000002 and 0x00000020) and the string ["Document_getDocumentData",[1]] . The client sends an ACK and then sends a transaction of 10 bytes consisting of two 32-bit integers (in this case 0x00000002 and 0x00000002) and the string "" . The server sends an ACK and then sends a reply of 468 bytes consisting of two 32-bit integers (in this case 0x00000003 and 0x000001cc) and the string:

["Document_setDocumentData",[1,"<data data-version=\"3\" zotero-version=\"4.0.29.10\"><session id=\"WfGUet40\"/><style id=\"http://www.ict.kth.se/courses/II2202/IEEElike-with-access.csl\" hasBibliography=\"1\" bibliographyStyleHasBeenSet=\"0\"/><prefs><pref name=\"fieldType\" value=\"ReferenceMark\"/><pref name=\"storeReferences\" value=\"true\"/><pref name=\"automaticJournalAbbreviations\" value=\"\"/><pref name=\"noteType\" value=\"\"/></prefs></data>"]]

The client responds with an ACK and 12 bytes consisting of two 32-bit integers (in this case 0x00000003 and 0x00000004) and the string null . The server responds with an ACK.

Next the server sends 55 bytes consisting of two 32-bit integers (in this case 0x00000004 and 0x0000002f) and the string ["Document_canInsertField",[1,"ReferenceMark"]] , to which the client replies with 12 bytes consisting of two 32-bit integers (in this case 0x00000004 and 0x00000004) and the string true .

Next the server sends 54 bytes consisting of two 32-bit integers (in this case 0x00000005 and 0x0000002e) and the string ["Document_cursorInField",[1,"ReferenceMark"]] . The client sends 12 bytes consisting of two 32-bit integers (in this case 0x00000005 and 0x00000004) and the string null .

The server now sends 54 bytes consisting of two 32-bit integers (in this case 0x00000006 and 0x0000002e) and the string ["Document_insertField",[1,"ReferenceMark",0]] , to which the client replies with an ACK and then sends 16 bytes consisting of two 32-bit integers (in this case 0x00000006 and 0x00000008) and the string [0,"",0] .

The server now sends 38 bytes consisting of two 32-bit integers (in this case 0x00000007 and 0x0000001e) and the string ["Field_setCode",[1,0,"TEMP"]] . The client sends an ACK and then a transaction of 12 bytes consisting of two 32-bit integers (in this case 0x00000007 and 0x00000004) and the string null .

The server replies with 50 bytes consisting of two 32-bit integers (in this case 0x00000008 and 0x0000002a) and the string ["Document_getFields",[1,"ReferenceMark"]] . The client replies with an ACK and a transaction containing 26 bytes consisting of two 32-bit integers (in this case 0x00000008 and 0x00000012) and the string [[0],["TEMP"],[0]] .

The server sends an ACK, then a transaction of 43 bytes consisting of two 32-bit integers (in this case 0x00000009 and 0x00000023) and the string ["Field_setText",[1,0,"[1]",false]] . The client sends an ACK, then sends a transaction of 12 bytes consisting of two 32-bit integers (in this case 0x00000009 and 0x00000004) and the string null .

The server sends an ACK and then a transaction of 31 bytes consisting of two 32-bit integers (in this case 0x0000000a and 0x00000017) and the string ["Field_getText",[1,0]] . The client replies with an ACK and then a transaction of 13 bytes consisting of two 32-bit integers (in this case 0x0000000a and 0x00000005) and the string "[1]" .

The server replies with a transaction of 1011 bytes consisting of two 32-bit integers (in this case 0x0000000b and 0x000003eb) and the string:

["Field_setCode",[1,0,"ITEM CSL_CITATION {\"citationID\":\"M6dUIB6w\",\"properties\":{\"formattedCitation\":\"[1]\",\"plainCitation\":\"[1]\"},\"citationItems\":[{\"id\":27516,\"uris\":[\"http://zotero.org/users/683389/items/34CDPXTJ\"],\"uri\":[\"http://zotero.org/users/683389/items/34CDPXTJ\"],\"itemData\":{\"id\":27516,\"type\":\"article-journal\",\"title\":\"Coherent File Distribution Protocol\",\"container-title\":\"Internet Request for Comments\",\"volume\":\"RFC 1235 (Experimental)\",\"abstract\":\"This memo describes the Coherent File Distribution Protocol (CFDP). This is an Experimental Protocol for the Internet community. It does not specify an Internet standard.\",\"URL\":\"http://www.rfc-editor.org/rfc/rfc1235.txt\",\"ISSN\":\"2070-1721\",\"author\":[{\"family\":\"Ioannidis\",\"given\":\"J.\"},{\"family\":\"Maguire\",\"given\":\"G.\"}],\"issued\":{\"date-parts\":[[\"1991\",6]]}}}],\"schema\":\"https://github.com/citation-style-language/schema/raw/master/csl-citation.json\"}"]]

We can observe that this transaction carries the citation information as a JSON-encoded value. If we pretty-print the information following CSL_CITATION (after removing the backslash "\" before each double quotation mark) we see:

{
    "citationID": "M6dUIB6w",
    "citationItems": [
        {
            "id": 27516,
            "itemData": {
                "ISSN": "2070-1721",
                "URL": "http://www.rfc-editor.org/rfc/rfc1235.txt",
                "abstract": "This memo describes the Coherent File Distribution Protocol (CFDP). This is an Experimental Protocol for the Internet community. It does not specify an Internet standard.",
                "author": [
                    {
                        "family": "Ioannidis",
                        "given": "J."
                    },
                    {
                        "family": "Maguire",
                        "given": "G."
                    }
                ],
                "container-title": "Internet Request for Comments",
                "id": 27516,
                "issued": {
                    "date-parts": [
                        [
                            "1991",
                            6
                        ]
                    ]
                },
                "title": "Coherent File Distribution Protocol",
                "type": "article-journal",
                "volume": "RFC 1235 (Experimental)"
            },
            "uri": [
                "http://zotero.org/users/683389/items/34CDPXTJ"
            ],
            "uris": [
                "http://zotero.org/users/683389/items/34CDPXTJ"
            ]
        }
    ],
    "properties": {
        "formattedCitation": "[1]",
        "plainCitation": "[1]"
    },
    "schema": "https://github.com/citation-style-language/schema/raw/master/csl-citation.json"
}
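The manual unescaping step can be avoided by decoding in two passes, since the field code is itself a JSON string inside the JSON request. The following is a minimal sketch, assuming the field code always starts with the prefix "ITEM CSL_CITATION " as in the capture; the function name `citation_from_set_code` is hypothetical, and the sample is an abbreviated stand-in for the full 1003-byte payload.

```python
import json

PREFIX = "ITEM CSL_CITATION "


def citation_from_set_code(payload: str) -> dict:
    """Extract the CSL citation object from a Field_setCode request payload."""
    # Outer layer: [method, [document, field, code]].
    method, args = json.loads(payload)
    if method != "Field_setCode":
        raise ValueError(f"unexpected method: {method}")
    code = args[2]  # json.loads has already removed the backslash escapes
    if not code.startswith(PREFIX):
        raise ValueError("field code does not hold a CSL citation")
    # Inner layer: the citation itself, again JSON.
    return json.loads(code[len(PREFIX):])


# Abbreviated stand-in for the payload shown above (only two keys kept).
sample = json.dumps(["Field_setCode", [1, 0, PREFIX + json.dumps({
    "citationID": "M6dUIB6w",
    "properties": {"formattedCitation": "[1]", "plainCitation": "[1]"},
})]])

citation = citation_from_set_code(sample)
print(json.dumps(citation, indent=4, sort_keys=True))
```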

The client sends an ACK followed by a transaction of 12 bytes consisting of two 32-bit integers (in this case 0x0000000b and 0x00000004) and the string null . Next the server sends 33 bytes consisting of two 32-bit integers (in this case 0x0000000c and 0x00000019) and the string ["Document_activate",[1]] . The client sends an ACK followed by a transaction of 12 bytes consisting of two 32-bit integers (in this case 0x0000000c and 0x00000004) and the string null . Next the server sends 33 bytes consisting of two 32-bit integers (in this case 0x0000000d and 0x00000019) and the string ["Document_complete",[1]] . The client sends an ACK followed by a transaction of 12 bytes consisting of two 32-bit integers (in this case 0x0000000d and 0x00000004) and the string null . The server sends an ACK.

At this point the first citation has been inserted!
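Every string in the exchange above parses as JSON: the server's requests are [method, arguments] arrays and the client's replies are bare JSON values (null, true, a string, or an array). The toy below replays the client side of this first exchange. It is purely illustrative: the table holds the literal replies seen in the capture rather than the result of any real document operation, and `CAPTURED_REPLIES` and `reply_to` are hypothetical names.

```python
import json

# Replies observed in the capture for the first addCitation exchange.
# These are replayed literals, not real document state.
CAPTURED_REPLIES = {
    "Application_getActiveDocument": [3, 1],
    "Document_getDocumentData": "",
    "Document_setDocumentData": None,
    "Document_canInsertField": True,
    "Document_cursorInField": None,
    "Document_insertField": [0, "", 0],
    "Field_setCode": None,
    "Document_getFields": [[0], ["TEMP"], [0]],
    "Field_setText": None,
    "Field_getText": "[1]",
    "Document_activate": None,
    "Document_complete": None,
}


def reply_to(request: str) -> str:
    """Return the JSON reply payload seen in the capture for this request."""
    method, _args = json.loads(request)
    # Compact separators reproduce the spacing used on the wire.
    return json.dumps(CAPTURED_REPLIES[method], separators=(",", ":"))


print(reply_to('["Application_getActiveDocument",[3]]'))
print(reply_to('["Document_insertField",[1,"ReferenceMark",0]]'))
```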

Next the user requests that the bibliography be inserted, so the client sends the command addBibliography. This transaction is 25 bytes long and consists of two 32-bit integers (in this case 0x00000000 and 0x00000011) and the string "addBibliography". Note that the transaction ID is again zero. In reply the server sends 45 bytes consisting of two 32-bit integers (in this case 0x0000000e and 0x00000025) and the string ["Application_getActiveDocument",[3]] . The client sends an ACK and a 13-byte reply consisting of two 32-bit integers (in this case 0x0000000e and 0x00000005) and the string [3,2] . The server sends 40 bytes consisting of two 32-bit integers (in this case 0x0000000f and 0x00000020) and the string ["Document_getDocumentData",[2]] . The client responds with 435 bytes consisting of two 32-bit integers (in this case 0x0000000f and 0x000001ab) and the string:

"<data data-version=\"3\" zotero-version=\"4.0.29.10\"><session id=\"WfGUet40\"/><style id=\"http://www.ict.kth.se/courses/II2202/IEEElike-with-access.csl\" hasBibliography=\"1\" bibliographyStyleHasBeenSet=\"0\"/><prefs><pref name=\"fieldType\" value=\"ReferenceMark\"/><pref name=\"storeReferences\" value=\"true\"/><pref name=\"automaticJournalAbbreviations\" value=\"\"/><pref name=\"noteType\" value=\"\"/></prefs></data>"

The server replies with 55 bytes consisting of two 32-bit integers (in this case 0x00000010 and 0x0000002f) and the string ["Document_canInsertField",[2,"ReferenceMark"]] . The client replies with 12 bytes consisting of two 32-bit integers (in this case 0x00000010 and 0x00000004) and the string true .

The server now sends 54 bytes consisting of two 32-bit integers (in this case 0x00000011 and 0x0000002e) and the string ["Document_cursorInField",[2,"ReferenceMark"]] . The client replies with 12 bytes consisting of two 32-bit integers (in this case 0x00000011 and 0x00000004) and the string null .

The server now sends 54 bytes consisting of two 32-bit integers (in this case 0x00000012 and 0x0000002e) and the string ["Document_insertField",[2,"ReferenceMark",0]] . The client replies with 16 bytes consisting of two 32-bit integers (in this case 0x00000012 and 0x00000008) and the string [0,"",0] .

The server follows this by sending 38 bytes consisting of two 32-bit integers (in this case 0x00000013 and 0x0000001e) and the string ["Field_setCode",[2,0,"BIBL"]] . The client replies with 12 bytes consisting of two 32-bit integers (in this case 0x00000013 and 0x00000004) and the string null .

The server sends 50 bytes consisting of two 32-bit integers (in this case 0x00000014 and 0x0000002a) and the string ["Document_getFields",[2,"ReferenceMark"]] .
The client replies with 1010 bytes consisting of two 32-bit integers (in this case 0x00000014 and 0x000003ea) and the string [[1,0],["ITEM CSL_CITATION {\"citationID\":\"M6dUIB6w\",\"properties\":{\"formattedCitation\":\"[1]\",\"plainCitation\":\"[1]\"},\"citationItems\":[{\"id\":27516,\"uris\":[\"http://zotero.org/users/683389/items/34CDPXTJ\"],\"uri\":[\"http://zotero.org/users/683389/items/34CDPXTJ\"],\"itemData\":{\"id\":27516,\"type\":\"article-journal\",\"title\":\"Coherent File Distribution Protocol\",\"container-title\":\"Internet Request for Comments\",\"volume\":\"RFC 1235 (Experimental)\",\"abstract\":\"This memo describes the Coherent File Distribution Protocol (CFDP). This is an Experimental Protocol for the Internet community. It does not specify an Internet standard.\",\"URL\":\"http://www.rfc-editor.org/rfc/rfc1235.txt\",\"ISSN\":\"2070-1721\",\"author\":[{\"family\":\"Ioannidis\",\"given\":\"J.\"},{\"family\":\"Maguire\",\"given\":\"G.\"}],\"issued\":{\"date-parts\":[[\"1991\",6]]}}}],\"schema\":\"https://github.com/citation-style-language/schema/raw/master/csl-citation.json\"}","BIBL"],[0,0]] .

The server sends 31 bytes consisting of two 32-bit integers (in this case 0x00000015 and 0x00000017) and the string ["Field_getText",[2,0]] . The client sends 20 bytes consisting of two 32-bit integers (in this case 0x00000015 and 0x0000000c) and the string "{Citation}" .

The server sends 71 bytes consisting of two 32-bit integers (in this case 0x00000016 and 0x0000003f) and the string ["Field_setCode",[2,0,"BIBL {\"custom\":[]} CSL_BIBLIOGRAPHY"]] . The client replies with 12 bytes consisting of two 32-bit integers (in this case 0x00000016 and 0x00000004) and the string null .

The server sends 68 bytes consisting of two 32-bit integers (in this case 0x00000017 and 0x0000003c) and the string ["Document_setBibliographyStyle",[2,-384,384,240,0,[384],1]] . The client replies with 12 bytes consisting of two 32-bit integers (in this case 0x00000017 and 0x00000004) and the string null .

The server sends 468 bytes consisting of two 32-bit integers (in this case 0x00000018 and 0x000001cc) and the string:

["Document_setDocumentData",[2,"<data data-version=\"3\" zotero-version=\"4.0.29.10\"><session id=\"WfGUet40\"/><style id=\"http://www.ict.kth.se/courses/II2202/IEEElike-with-access.csl\" hasBibliography=\"1\" bibliographyStyleHasBeenSet=\"1\"/><prefs><pref name=\"fieldType\" value=\"ReferenceMark\"/><pref name=\"storeReferences\" value=\"true\"/><pref name=\"automaticJournalAbbreviations\" value=\"\"/><pref name=\"noteType\" value=\"\"/></prefs></data>"]]
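The document data carried by Document_getDocumentData replies and Document_setDocumentData requests is a JSON string holding a small XML document. Below is a minimal sketch of reading it with Python's standard library. The function name `parse_document_data` and the dictionary keys are hypothetical. The sample is the client's earlier 427-byte reply to Document_getDocumentData; the request just above differs from it only in that bibliographyStyleHasBeenSet has changed from "0" to "1".

```python
import json
import xml.etree.ElementTree as ET

# The earlier Document_getDocumentData reply, re-encoded as a JSON string
# so that it matches what is sent on the wire.
payload = json.dumps(
    '<data data-version="3" zotero-version="4.0.29.10">'
    '<session id="WfGUet40"/>'
    '<style id="http://www.ict.kth.se/courses/II2202/IEEElike-with-access.csl"'
    ' hasBibliography="1" bibliographyStyleHasBeenSet="0"/>'
    '<prefs>'
    '<pref name="fieldType" value="ReferenceMark"/>'
    '<pref name="storeReferences" value="true"/>'
    '<pref name="automaticJournalAbbreviations" value=""/>'
    '<pref name="noteType" value=""/>'
    '</prefs></data>'
)


def parse_document_data(payload: str) -> dict:
    """Decode the JSON string, then pull a few values out of the XML."""
    root = ET.fromstring(json.loads(payload))
    style = root.find("style")
    return {
        "zotero_version": root.get("zotero-version"),
        "session_id": root.find("session").get("id"),
        "style_id": style.get("id"),
        "bibliography_style_set": style.get("bibliographyStyleHasBeenSet") == "1",
        "prefs": {p.get("name"): p.get("value") for p in root.iter("pref")},
    }


info = parse_document_data(payload)
print(info["style_id"], info["prefs"]["fieldType"])
```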

The client replies with 12 bytes consisting of two 32-bit integers (in this case 0x00000018 and 0x00000004) and the string null . The server sends 300 bytes consisting of two 32-bit integers (in this case 0x00000019 and 0x00000124) and the string ["Field_setText",[2,0,"{\\rtf [1]\\tab J. Ioannidis and G. Maguire, \\uc0\\u8216{}Coherent File Distribution Protocol\\uc0\\u8217{}, {\\i{}Internet Request for Comments}, vol. RFC 1235 (Experimental), Jun. 1991 [Online]. Available: http://www.rfc-editor.org/rfc/rfc1235.txt\r\n\\\r\n}",true]] . The client replies with 12 bytes consisting of two 32-bit integers (in this case 0x00000019 and 0x00000004) and the string null .

The server sends 33 bytes consisting of two 32-bit integers (in this case 0x0000001a and 0x00000019) and the string ["Document_activate",[2]] . The client sends an ACK and then a transaction of 12 bytes consisting of two 32-bit integers (in this case 0x0000001a and 0x00000004) and the string null . The server sends 33 bytes consisting of two 32-bit integers (in this case 0x0000001b and 0x00000019) and the string ["Document_complete",[2]] . The client sends an ACK and then a transaction of 12 bytes consisting of two 32-bit integers (in this case 0x0000001b and 0x00000004) and the string null . The server sends an ACK.

At this point the client sends another addCitation command, again with a zero transaction ID. The server responds with 45 bytes consisting of two 32-bit integers (in this case 0x0000001c and 0x00000025) and the string ["Application_getActiveDocument",[3]] . The client responds with [3,3] . Then the server sends ["Document_getDocumentData",[3]] . The client responds with

"<data data-version=\"3\" zotero-version=\"4.0.29.10\"><session id=\"WfGUet40\"/><style id=\"http://www.ict.kth.se/courses/II2202/IEEElike-with-access.csl\" hasBibliography=\"1\" bibliographyStyleHasBeenSet=\"1\"/><prefs><pref name=\"fieldType\" value=\"ReferenceMark\"/><pref name=\"storeReferences\" value=\"true\"/><pref name=\"automaticJournalAbbreviations\" value=\"\"/><pref name=\"noteType\" value=\"\"/></prefs></data>"

... until eventually the server sends ["Field_setText",[3,1,"{\\rtf [2\\uc0\\u8211{}4]}",true]] , which sets the [2–4] citation in the text.
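The \uc0\uNNNN{} sequences in these Field_setText strings are RTF Unicode escapes in which NNNN is a decimal code point: 8211 is the en dash, and 8216 and 8217 are the single quotation marks around the titles in the bibliography. Below is a small sketch that expands them. It handles only the exact form seen in this capture (non-negative values followed by an empty group) and is not a general RTF parser; `decode_rtf_unicode` is a name chosen for illustration.

```python
import re

# Matches the escape form seen in the capture: \uc0\uNNNN{} (decimal code point).
RTF_UNICODE = re.compile(r"\\uc0\\u(\d+)\{\}")


def decode_rtf_unicode(text: str) -> str:
    """Replace each \\uc0\\uNNNN{} escape with the character it stands for."""
    return RTF_UNICODE.sub(lambda m: chr(int(m.group(1))), text)


# Field text after JSON decoding, so each escape has one backslash, not two.
print(decode_rtf_unicode(r"[2\uc0\u8211{}4]"))
```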

Next there is a request for the bibliography (I am not sure how). This leads to the following messages from the server; I have elided the client's responses of null:

["Field_setCode",[3,2,"BIBL {\"custom\":[]} CSL_BIBLIOGRAPHY"]]

["Field_setText",[3,2,"{\\rtf [1]\\tab J. Ioannidis and G. Maguire, \\uc0\\u8216{}Coherent File Distribution Protocol\\uc0\\u8217{}, {\\i{}Internet Request for Comments}, vol. RFC 1235 (Experimental), Jun. 1991 [Online]. Available: http://www.rfc-editor.org/rfc/rfc1235.txt\r\n\\\r\n[2]\\tab M. Blaze, J. Ioannidis, and A. Keromytis, \\uc0\\u8216{}DSA and RSA Key and Signature Encoding for the KeyNote Trust Management System\\uc0\\u8217{}, {\\i{}Internet Request for Comments}, vol. RFC 2792 (Informational), Mar. 2000 [Online]. Available: http://www.rfc-editor.org/rfc/rfc2792.txt\r\n\\\r\n[3]\\tab S. Bellovin, J. Ioannidis, A. Keromytis, and R. Stewart, \\uc0\\u8216{}On the Use of Stream Control Transmission Protocol (SCTP) with IPsec\\uc0\\u8217{}, {\\i{}Internet Request for Comments}, vol. RFC 3554 (Proposed Standard), Jul. 2003 [Online]. Available: http://www.rfc-editor.org/rfc/rfc3554.txt\r\n\\\r\n[4]\\tab M. Blaze, J. Feigenbaum, J. Ioannidis, and A. Keromytis, \\uc0\\u8216{}The KeyNote Trust-Management System Version 2\\uc0\\u8217{}, {\\i{}Internet Request for Comments}, vol. RFC 2704 (Informational), Sep. 1999 [Online]. Available: http://www.rfc-editor.org/rfc/rfc2704.txt\r\n\\\r\n}",true]]

["Document_activate",[3]]

and finally a ["Document_complete",[3]] from the server. Now the Bibliography is set in the document and the sequence is complete.