1 00:00:05,140 --> 00:00:08,920 Well, good afternoon, everyone, and thank you so much for coming. This is very special. 2 00:00:09,760 --> 00:00:15,219 It's not the first time I've introduced a lecture, but I'm introducing the first of what we think is a new cycle of lectures. 3 00:00:15,220 --> 00:00:18,340 So that's a very special moment for the Voltaire Foundation. 4 00:00:18,790 --> 00:00:25,959 We've had, for 25 years or more, an annual lecture called the Besterman Lecture that I know wrong with all us here. 5 00:00:25,960 --> 00:00:31,120 But, um, where we invite Enlightenment historians and literary critics to come and give a talk. 6 00:00:31,630 --> 00:00:38,530 But in recent years, for lots of reasons that we know about, the Voltaire Foundation has been moving more and more towards digital research. 7 00:00:39,250 --> 00:00:45,639 Um, when I first became director, I was given to sell, on my first day in the office, a CD-ROM. 8 00:00:45,640 --> 00:00:50,770 That's a bit historic, isn't it? And then we produced a database in the early 2000s, Electronic Enlightenment. 9 00:00:51,090 --> 00:00:57,430 Now we're working on digitising from scratch the 201 volumes of the Complete Works of Voltaire. 10 00:00:57,640 --> 00:01:04,450 So although we've been primarily a paper publisher until recently, we've always been doing digital research. 11 00:01:05,140 --> 00:01:10,570 And over the last few years, particularly working with Glenn, the digital research has become terribly important. 12 00:01:10,570 --> 00:01:16,300 As we've accumulated more data, there's been more opportunities and more excitement in exploiting that data and thinking about how to use it. 13 00:01:16,690 --> 00:01:24,280 So it seemed, in the evolution of the Voltaire Foundation, that this is a good moment for us to sort of make the point by having an annual lecture. 
14 00:01:24,640 --> 00:01:32,950 And, um, last year we launched a new online open access journal, Digital Enlightenment Studies. 15 00:01:34,020 --> 00:01:41,819 I can't say take a copy with you, because it's online, but you can take with you a postcard, a beautifully designed postcard, uh, outside. 16 00:01:41,820 --> 00:01:44,010 So please have a look. It's a fantastic journal. 17 00:01:44,010 --> 00:01:52,530 It's establishing a whole area of research, and it contains wonderful articles, including one by Nico and his team in Helsinki, who are here. 18 00:01:52,800 --> 00:01:56,490 And if you're inspired to send us an article, we'd like that too. 19 00:01:56,610 --> 00:02:02,939 So, um, we've begun by, uh, founding this new online journal, open access. 20 00:02:02,940 --> 00:02:11,700 It looks beautiful. And we're now, in the same spirit, beginning a new series of, uh, annual lectures. 21 00:02:12,150 --> 00:02:17,100 Um, it's a huge pleasure to have Glenn to give this first lecture in the series. 22 00:02:17,610 --> 00:02:24,299 Um, I think we've met probably before, but the first time we met and worked together for a long time was when you were Mellon, 23 00:02:24,300 --> 00:02:31,470 um, fellow in digital humanities, uh, funded by the Mellon Foundation from 2011 to 2013. 24 00:02:31,890 --> 00:02:39,990 And although you then left us and you went to the A and you were you were at the centre for research and you, of course, before that. 25 00:02:39,990 --> 00:02:47,550 And since then you've been working in Chicago on the ARTFL Project, producing amazing resources, and not least the Encyclopédie. 26 00:02:48,090 --> 00:02:54,030 Um, and since 2018, Glenn has been professor of D.H. in the French department at the Sorbonne. 27 00:02:54,810 --> 00:02:57,990 Um, but in all that time, you've also had a connection with the. 28 00:02:57,990 --> 00:03:00,180 But we've maintained the connection with the Voltaire Foundation. 
29 00:03:00,810 --> 00:03:07,140 And, uh, you're currently, since 2021, the Astra Foundation research fellow in digital humanities. 30 00:03:08,100 --> 00:03:13,440 And I also want to say a little bit of celebrating the way that the Voltaire Foundation is moving into digital research. 31 00:03:14,130 --> 00:03:20,490 I do want to pay particular thanks to the Astra Foundation, um, who's represented by Elisabet. 32 00:03:20,850 --> 00:03:28,350 The also Foundation three, four years ago made us an extraordinary grant which has enabled us to really establish ourselves in this area. 33 00:03:28,740 --> 00:03:34,379 And this is a really transformational grant which has enabled us first to establish ourselves 34 00:03:34,380 --> 00:03:39,360 in the area of digital research and to employ lots of younger researchers and younger people. 35 00:03:39,600 --> 00:03:46,780 So it's played a really key part in helping us develop in that direction and encourage young researchers and scholars in Oxford. 36 00:03:46,800 --> 00:03:50,880 We're incredibly grateful. Um, some of that was such as a hit, some lights, 37 00:03:51,270 --> 00:04:01,320 and Glen has been the Astra Foundation research fellow and has been a key player in helping us directs that, um, that new direction of AI research. 38 00:04:01,770 --> 00:04:08,700 Um, I also should mention Glenn's publications. Um, he published his thesis interest on the Literary Theory of Peggy, published by Oxford. 39 00:04:09,090 --> 00:04:14,520 Um, I know that you were just a specialist for humanities research on political vocabularies. 40 00:04:14,880 --> 00:04:20,940 Um, we work together on a book on Voltaire's correspondence and readings of quotation in Voltaire's correspondence. 41 00:04:21,390 --> 00:04:27,930 Uh, this is a lovely digitising. Had to stick with Glenn Row, you know, monograph series published by Oxford. 42 00:04:28,380 --> 00:04:33,030 Um, I'm here to advertise. Obviously. Um, we have copies on sale afterwards. 
43 00:04:33,300 --> 00:04:38,250 Um, and recently, I think Glenn's published a book called Observability of Earth, which we published. 44 00:04:38,250 --> 00:04:46,890 We've done it with digital Zoom. So there's a hugely impressive list of publications, not to mention immensely distinguished articles. 45 00:04:47,340 --> 00:04:50,760 Glenn and his colleagues from Chicago over here, 46 00:04:50,760 --> 00:05:00,120 Robert Silk Lucas here of have done really pioneering work in thinking about data reuse and citation strategy and the Oscar party and the beast. 47 00:05:00,750 --> 00:05:04,970 Um, and some of those things I think may come back into this lecture today. 48 00:05:05,010 --> 00:05:12,659 I'm not sure I can get us clear what he's going to talk about. But crucially, Glenn's also just pulled off the the magic for everybody's session. 49 00:05:12,660 --> 00:05:18,240 And, you know, so you consolidate a grant I think in year two which has a great title and a great logo. 50 00:05:18,510 --> 00:05:26,220 Modelling enlightenment is modern. I should have said modern modelling enlightenment, reassembling networks of modernity through data driven research. 51 00:05:26,910 --> 00:05:34,020 So the idea here is to use big data and to use that to shape that with thought experiment to to try and remap. 52 00:05:34,110 --> 00:05:40,790 I think, well, I'm not Glenn, but I think the attempt will be to remap the way we can eventually think about French Enlightenment, 53 00:05:40,800 --> 00:05:46,260 French Enlightenment writing. Um, I'm not sure how much today's lecture will draw on that. 54 00:05:46,260 --> 00:05:52,980 We're going to see, but you have a great title, Poetics of Text Reuse Digital Intertextuality in the 18th Century archive. 55 00:05:53,250 --> 00:06:03,350 Glenn is a fantastic pleasure to have you. Thank you very much. But I thank you so much, Nicholas. 56 00:06:03,350 --> 00:06:04,729 And thank you all to to coming. 
57 00:06:04,730 --> 00:06:11,420 It's a really genuine and great pleasure to be here today to give this first annual lecture in digital, 58 00:06:11,780 --> 00:06:15,319 uh, enlightenment studies, which is, as Nicholas just mentioned, 59 00:06:15,320 --> 00:06:19,060 a field in which we're heavily invested collectively: at the Voltaire 60 00:06:19,070 --> 00:06:25,490 Foundation, in Chicago, thanks to the Astra Foundation, at the Sorbonne, uh, sort of a bit all over. 61 00:06:25,790 --> 00:06:31,879 Uh, we're committed to promoting this new field of research and to presenting it to the world, 62 00:06:31,880 --> 00:06:37,520 both in the form of this public lecture and in the new open access journal that Nicholas mentioned. 63 00:06:38,090 --> 00:06:45,860 The reason for both of these initiatives, uh, is the realisation that this field of study, uh, 64 00:06:45,860 --> 00:06:52,100 digital approaches to the 18th century in general, has reached a certain maturity, and digital collections, 65 00:06:52,130 --> 00:06:59,000 uh, a certain critical mass, uh, which warrants and welcomes new scholarship and forms of academic outreach. 66 00:06:59,720 --> 00:07:06,350 Uh, this is not to say that digital, uh, research has fully and finally been accepted by the humanities mainstream, 67 00:07:06,470 --> 00:07:10,940 uh, but simply that we're perhaps on a better footing today than at any time in the past two decades. 68 00:07:11,720 --> 00:07:18,020 Uh, certain barriers remain, uh, not just for digital 18th century studies, but for the wider digital humanities world, 69 00:07:18,110 --> 00:07:23,510 uh, in general, which is still in some venues, uh, kept at a disciplinary distance. 70 00:07:24,320 --> 00:07:28,820 It is often said, for example, that, uh, D.H. lacks sufficient theoretical grounding. 71 00:07:29,750 --> 00:07:33,230 It's just the new positivism and all praxis, no theory. 
72 00:07:33,740 --> 00:07:39,590 Uh, and this is somewhat true if you attack literary and historical questions from a purely computational perspective, 73 00:07:40,070 --> 00:07:47,570 uh, one text is the same as another, whether it's a novel or a grocery list once reduced to ones and zeros. 74 00:07:48,200 --> 00:07:52,939 Uh, if, however, uh, you start from a theoretical framework such as intertextuality, 75 00:07:52,940 --> 00:07:58,159 as we'll see today, or the idea of modelling as an epistemological exercise to, 76 00:07:58,160 --> 00:08:03,770 uh, follow Willard McCarty, then the conversation between human and machine becomes all the more engaging. 77 00:08:04,280 --> 00:08:08,510 And this is especially true in our current age of so-called artificial intelligence, 78 00:08:09,080 --> 00:08:17,600 where large language models capable of reading whole universes of text and generating humanlike responses based on an almost infinite archive, 79 00:08:18,170 --> 00:08:22,610 and are in dire need of theoretical, not to mention ethical, grounding from the humanities. 80 00:08:23,150 --> 00:08:26,810 These models are the fever dream, I would argue, of post-structuralism, 81 00:08:27,770 --> 00:08:34,850 and we need scholars conversant in theory with a capital T to understand these new artificial text realities. 82 00:08:35,770 --> 00:08:41,680 What I'd like to present to you today, then, is perhaps surprisingly, much less technical and much more theoretical. 83 00:08:41,920 --> 00:08:46,090 Uh, in terms of scope, uh, and this is quite intentional on my part, uh, 84 00:08:46,090 --> 00:08:53,170 in a modest effort to demonstrate that thinking with and thinking through computational methods and approaches 85 00:08:53,740 --> 00:09:00,219 to literary historical data can lead to real insights and uncover the rich complexity of humanistic, 86 00:09:00,220 --> 00:09:05,530 generative intelligences, uh, that are all too often hidden in our ever growing digital archives. 
87 00:09:06,130 --> 00:09:09,250 So one such computational approach that we've been, uh, 88 00:09:09,250 --> 00:09:14,320 using for years at the University of Chicago and elsewhere is the automatic detection of text reuse. 89 00:09:15,100 --> 00:09:20,919 And this is, as I said, old; old in computational terms, it goes back to 2002, 90 00:09:20,920 --> 00:09:27,489 at least in its first official, uh, naming of text reuse, uh, and it's quite, uh, prosaic in its definition. 91 00:09:27,490 --> 00:09:31,930 It's the reuse of existing written sources in the creation of a new text. 92 00:09:32,210 --> 00:09:37,420 Uh, and in this paper, they talk about a sort of similarity spectrum where you have information retrieval, 93 00:09:37,420 --> 00:09:43,090 which is the classic, uh, document retrieval methods that go back to the 60s and 70s, 94 00:09:43,120 --> 00:09:47,199 uh, bag of words models, if that means anything to you, uh, 95 00:09:47,200 --> 00:09:52,780 where we think about how documents are related to each other in a sort of semantic similarity space. 96 00:09:53,260 --> 00:09:56,140 Text reuse is somewhere in the middle between the two, 97 00:09:56,350 --> 00:10:01,570 and then on the right is the near duplicate detection, which we know and love as plagiarism detection. 98 00:10:01,990 --> 00:10:10,060 Uh, in all of its glory. To think about this theoretically, we might say that text reuse is a form of intertextuality, 99 00:10:10,750 --> 00:10:15,330 uh, which is a theory that comes out of the 60s and 70s in France. 100 00:10:15,340 --> 00:10:21,980 Uh, Julia Kristeva is the first to use the term, uh. She uses it as a translation of Bakhtin's notion of dialogism. 101 00:10:22,060 --> 00:10:25,270 Uh, and then it sort of takes off in the Tel Quel circle from there. 
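[Editorial illustration, not part of the spoken lecture.] The similarity spectrum described in the cues above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the detection method used at Chicago or in the 2002 paper: the function names (`bag_of_words_cosine`, `shared_ngrams`, `jaccard`), the choice of word trigrams, and the two sample sentences are all hypothetical and chosen only for illustration. Bag-of-words cosine similarity stands in for the information-retrieval end of the spectrum, shared word n-grams for text reuse, and n-gram Jaccard overlap for near-duplicate detection.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words_cosine(a, b):
    """Cosine similarity of word-count vectors; word order is ignored."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[w] * cb[w] for w in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def word_ngrams(text, n=3):
    """Set of consecutive word n-grams; unlike a bag of words, order matters."""
    tokens = tokenize(text)
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def shared_ngrams(a, b, n=3):
    """N-grams present in both texts: candidate reused passages."""
    return word_ngrams(a, n) & word_ngrams(b, n)


def jaccard(a, b, n=3):
    """N-gram overlap ratio; values near 1.0 suggest a near duplicate."""
    ga, gb = word_ngrams(a, n), word_ngrams(b, n)
    union = ga | gb
    return len(ga & gb) / len(union) if union else 0.0


# Hypothetical sample sentences, loosely modelled on a dictionary entry
# and a later article that repeats it before commenting on it.
source = "it is a plant that grows in Brazil and on the islands of South America"
reuse = ("The aguaxima is a plant that grows in Brazil and on the islands "
         "of South America. This is all that we are told about it.")
scrambled = " ".join(reversed(source.split()))

print(bag_of_words_cosine(source, scrambled))  # same words, order ignored
print(len(shared_ngrams(source, scrambled)))   # word order destroyed
print(len(shared_ngrams(source, reuse)))       # contiguous borrowed passage
print(jaccard(source, reuse))
```

The scrambled sentence shows why the two ends of the spectrum differ: a bag-of-words measure cannot tell it apart from the original, while an order-sensitive n-gram comparison finds nothing shared. Real text-reuse detection over large historical collections would also need normalisation for spelling variation and some tolerance for small changes, which this sketch omits.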
102 00:10:25,660 --> 00:10:31,360 Uh, you could have a sort of conservative approach to intertextuality, which would be the Harold Bloom school of influence, 103 00:10:31,870 --> 00:10:36,729 uh, where criticism is the art of knowing the hidden roads that go from poem to poem, 104 00:10:36,730 --> 00:10:42,280 where poetic geniuses sort of have to grapple with the great giants of the past, 105 00:10:42,280 --> 00:10:49,210 and they necessarily borrow from these, uh, predecessors to assert their own poetic authority. 106 00:10:49,810 --> 00:10:53,799 More radical, uh, is the Barthesian school of intertextuality, 107 00:10:53,800 --> 00:10:58,750 where every text becomes a sort of intertext, where other texts are always present in it at various levels, 108 00:10:59,110 --> 00:11:04,060 in more or less recognisable forms, texts from the previous culture and those from the surrounding culture. 109 00:11:04,270 --> 00:11:08,890 Every text then becomes the new fabric or tissue of quotations from the past. 110 00:11:09,490 --> 00:11:15,140 Uh, between these two. I would say our position is between these two views. 111 00:11:15,470 --> 00:11:24,080 I'd take a more middle road. Not surprisingly, I'm more Genettian in my take on intertextuality. Genette, in his 112 00:11:24,080 --> 00:11:29,240 great work Palimpsests, defines it as a relationship of co-presence between two or more texts. 113 00:11:29,720 --> 00:11:36,290 That is to say, eidetically and most often by the actual presence of one text in another in its most explicit and literal form. 
114 00:11:37,130 --> 00:11:40,910 It is the traditional practice of citation, which I now put in the middle of our spectrum, 115 00:11:41,840 --> 00:11:45,140 with or without inverted commas, with or without precise reference, 116 00:11:45,620 --> 00:11:51,830 and then the less explicit and less canonical form, that of plagiarism, uh, which is an undeclared but still literal borrowing. 117 00:11:52,310 --> 00:11:55,580 Clearly, today we think of text reuse, 118 00:11:56,210 --> 00:12:01,490 it's a bit sad, actually, as a form of plagiarism, and almost exclusively as a form of plagiarism. 119 00:12:01,940 --> 00:12:08,270 And this is because we have this notion that words belong to certain authors and can't be reused without their authorisation. 120 00:12:08,670 --> 00:12:10,909 Uh, this has played out recently in academia, 121 00:12:10,910 --> 00:12:21,290 if you followed the poor plight of Claudine Gay, the first African-American woman president of Harvard, who was fired, uh, ostensibly for plagiarism. 122 00:12:21,320 --> 00:12:28,580 Uh, this article in the Guardian tells you exactly what a broad section this notion of plagiarism is. 123 00:12:29,180 --> 00:12:33,200 And a certain Barbara Glatt, who was responsible for the Turnitin software. 124 00:12:33,200 --> 00:12:39,229 If you've ever used it, you know about it. It can be direct: copying something word for word without attribution. 125 00:12:39,230 --> 00:12:42,290 Indirect: the wholesale theft of ideas. Mosaic, 126 00:12:42,290 --> 00:12:49,099 and this is a term that will come back today, hold it in your mind: uh, changing some words while copying others. Or even an honest mistake, 127 00:12:49,100 --> 00:12:55,190 an error of omission and execution. All things Gay was accused of having done, even as she continues to stand by her scholarship. 128 00:12:55,190 --> 00:12:58,790 These are all things I think we all do as, uh, as scholars. 
129 00:12:59,180 --> 00:13:05,750 Uh, and it's a it's once you move into the legalistic realm of plagiarism, that text reuse becomes a problematic practice. 130 00:13:05,960 --> 00:13:12,260 But in the same article, if you move down, uh, the professor of uh, linguistic anthropology at Notre Dame, Susan Blum, 131 00:13:12,500 --> 00:13:18,980 says that there's a sort of continuum, that it's too cut and dry to think of plagiarism just as as it's theft of other, 132 00:13:19,310 --> 00:13:23,000 other people's words, and that somewhere between originality and complete copying, 133 00:13:23,000 --> 00:13:28,729 there's a language and culture that moves into the middle, and that it is in this language and culture. 134 00:13:28,730 --> 00:13:35,060 That term, perhaps text reuse inhabits, uh, this middle, this middle ground. 135 00:13:35,330 --> 00:13:38,090 And I would say that that's exactly the case in the 18th century, 136 00:13:38,090 --> 00:13:46,819 where they didn't yet have this notion of plagiarism as the theft of another author's word and that the culture, uh, gravitated. 137 00:13:46,820 --> 00:13:51,560 The culture of citation gravitated between these two poles of originality and imitation. 138 00:13:52,280 --> 00:13:56,210 And in a nutshell, the 18th century clearly is, uh. 139 00:13:57,480 --> 00:14:01,440 Is dominated still at the beginning of the 18th century by this classical notion of imitation, 140 00:14:01,710 --> 00:14:06,030 uh, which aesthetically was the model for, uh, hundreds of years. 141 00:14:06,660 --> 00:14:14,370 And, uh, it it's predicated on this notion of the value of a work, uh, came from its conformity to the great works of the past. 142 00:14:15,210 --> 00:14:20,280 And it was through these works that we could gain access to truth, beauty, nature, uh, 143 00:14:20,280 --> 00:14:26,370 all of these higher level ideas of which the ancients were somehow the strict equivalent. 
144 00:14:26,910 --> 00:14:35,700 So, speaking of Virgil, for example, Pope remarks, uh, that perhaps he seemed above the critic's law, and but from Nature's fountains scorned to draw. 145 00:14:36,300 --> 00:14:40,710 But when to examine every part he came, Nature and Homer were, he found, the same. 146 00:14:44,200 --> 00:14:51,129 From the 18th century onwards, and under the influence of German Romanticism in particular, literary works 147 00:14:51,130 --> 00:14:58,230 were understood as more or less the emanation of what was unique in each individual, their genius, and at the same time 148 00:14:58,720 --> 00:15:05,470 the poet developed a new and immediate relationship with nature. Uh, one that didn't have to be based on representations of the ancients. 149 00:15:05,890 --> 00:15:14,860 To see this in action, we look to Racine's preface to Britannicus from 1676, where he praised the conformity of his work to that of Tacitus. 150 00:15:15,180 --> 00:15:19,090 I copied my characters after the greatest painter of antiquity, Tacitus, 151 00:15:19,990 --> 00:15:25,600 and I was then so full of reading of this excellent historian that there was not a striking feature in my tragedy 152 00:15:25,900 --> 00:15:29,560 for which he did not give me the idea. The model, clearly, of imitation. 153 00:15:29,590 --> 00:15:36,850 Compare this, just 100 years later, with Rousseau, as we know, in his opening volley of the Confessions, 154 00:15:36,850 --> 00:15:40,840 where he takes pride in having no predecessor and indeed no successor. 155 00:15:41,340 --> 00:15:48,130 Uh, I am creating a work that is without prior example and whose execution will have no imitators. 156 00:15:48,610 --> 00:15:57,130 Never mind that generically he borrows his title from Saint Augustine, and then thematically he is very much continuing just the work of Montaigne. 
157 00:15:57,850 --> 00:15:59,950 Uh, Montaigne is in fact, uh, 158 00:16:00,100 --> 00:16:07,120 key here because he demonstrates nicely for us the inherent ambiguity of a writer who is at once imbued with the tradition of imitation, 159 00:16:07,840 --> 00:16:14,470 inserting his flirt into his text and already aware of his identity as a unique and original author, 160 00:16:14,740 --> 00:16:17,880 giving himself over to the reader tout entier et tout nu. 161 00:16:19,400 --> 00:16:24,230 This notion of originality is inseparable from, I would say, 162 00:16:24,620 --> 00:16:29,840 the formation of any idea that we can have of text reuse in the 18th century. 163 00:16:30,830 --> 00:16:33,440 And this is the notion that Roland Mortier, for example, 164 00:16:33,440 --> 00:16:38,570 has demonstrated as being rather new to the 18th century and in fact tributary of the French Enlightenment. 165 00:16:38,990 --> 00:16:47,900 Uh, this notion of originality as an aesthetic category, and in fact, the word isn't used in French until the very end of the 17th century. 166 00:16:48,320 --> 00:16:54,470 Uh, by, uh, talking about the originality of paintings, one of a kind that can't be copied. 167 00:16:56,520 --> 00:17:03,300 Diderot goes on to use it in 1759, uh, as describing someone of a very original character. 168 00:17:03,660 --> 00:17:09,470 The originality of the Baron declaration. The Danish minister to France, to Sophie Volland. 169 00:17:10,950 --> 00:17:14,249 It first enters the lexicon in 1743, 170 00:17:14,250 --> 00:17:19,840 in the Dictionnaire de Trévoux, which proudly proclaims that you won't find this word in any other dictionary. 
171 00:17:20,010 --> 00:17:28,400 So the originality of the definition of originality is here proclaimed, then to be repeated, but, uh, 172 00:17:28,530 --> 00:17:34,019 a bit laconically in the Dictionnaire de l'Académie française as just the character of that which is original, uh, 173 00:17:34,020 --> 00:17:40,710 of people or things. The great Encyclopédie, of which there will certainly be question today, uh, 174 00:17:41,010 --> 00:17:46,799 plays with this semantic shift between original and originality, and it can be seen in these two entries. 175 00:17:46,800 --> 00:17:53,040 And this is exactly how the Encyclopédie works, uh, which seems to be, uh, 176 00:17:53,040 --> 00:17:57,779 which presents both the aesthetics of imitation that is still operative in the 18th century, 177 00:17:57,780 --> 00:18:02,000 but also presents it as sort of almost already exhausted, uh, by this time. 178 00:18:02,010 --> 00:18:06,960 So original: it's the first design or authentic, uh, instrument of something, 179 00:18:07,230 --> 00:18:11,430 which must serve as a model or an example of something to be copied or imitated. 180 00:18:11,430 --> 00:18:16,950 Today we can hardly find any ancient title of possession, subjugation, etc. that is original; 181 00:18:16,990 --> 00:18:22,170 there are only copies collated from the originals. So we've spent our originality. 182 00:18:22,170 --> 00:18:26,430 And then the most important thing, of course, in the Encyclopédie are the cross references. 183 00:18:26,430 --> 00:18:29,940 So there's a cross reference to originality, which we think is by Diderot. 184 00:18:31,400 --> 00:18:37,250 And he says, uh, it's a way of doing a common thing in a singular and distinguished way. 185 00:18:37,370 --> 00:18:40,550 Originality is very rare, and most people are nothing but copies of each other. 186 00:18:41,110 --> 00:18:44,300 Uh, the title of the original is given in good and bad ways. 
187 00:18:44,450 --> 00:18:50,390 One can only think of Rousseau there in the sense of an original, both good and bad. 188 00:18:50,660 --> 00:19:00,800 Uh, so this, uh, this tension between imitation and originality, between copying, compiling and eventually plagiarism is central to the project. 189 00:19:01,070 --> 00:19:06,950 Uh, uh, as the editors, Diderot and d'Alembert, were dogged from the outset by accusations of impiety, 190 00:19:06,950 --> 00:19:13,520 improper use of authorities, and flat-out plagiarism. D'Alembert responds to their detractors 191 00:19:13,820 --> 00:19:18,950 in the editors' introduction to the third volume, where he says, by its very nature, 192 00:19:19,340 --> 00:19:22,940 therefore the Encyclopédie must contain a large number of things that are not new. 193 00:19:23,940 --> 00:19:29,520 Among the various works that the Encyclopédie has been accused of borrowing from, other dictionaries have been singled out. 194 00:19:30,360 --> 00:19:37,979 The resemblance that is sometimes found between an article and an article in some dictionary is forced by the nature of the subject, 195 00:19:37,980 --> 00:19:43,080 especially when the article is short and consists of only a definition or a minor historical fact. 196 00:19:43,500 --> 00:19:49,649 This is so true that most dictionaries resemble each other in a large number of articles, because they simply cannot do otherwise. 197 00:19:49,650 --> 00:19:52,770 So we are talking here about how you construct a dictionary, how you build it, 198 00:19:53,130 --> 00:19:57,030 and typically, practically, how you build it is you start with the list of headwords, 199 00:19:57,240 --> 00:20:00,390 and you inherit this list of headwords from other dictionaries. 
200 00:20:00,690 --> 00:20:04,440 And it was understood that if an older dictionary had a word in it, 201 00:20:05,010 --> 00:20:08,640 you probably needed it in your dictionary whether you thought it was a good idea or not. 202 00:20:09,150 --> 00:20:16,590 And Diderot takes this head on, and this is quite possibly my favourite article in the 74,000 articles of the Encyclopédie. 203 00:20:16,800 --> 00:20:19,740 So here is the article Aguaxima in the Dictionnaire de Trévoux, 204 00:20:19,740 --> 00:20:27,630 so the direct predecessor of the Encyclopédie, which tells us it's a plant in Brazil and the islands of South America, and that's about it. 205 00:20:28,410 --> 00:20:33,030 Diderot includes this article in the first volume, and he repeats it. 206 00:20:33,480 --> 00:20:39,360 He just reuses the first bit of the article, exactly, word for word. It's a plant that grows in Brazil and on the islands of South America. 207 00:20:39,690 --> 00:20:44,820 This is all that we are told about it. And I would like to know for whom such descriptions are made. 208 00:20:45,180 --> 00:20:51,509 It cannot be for the natives of the countries concerned, who are likely to know more about the aguaxima than is contained in this description, 209 00:20:51,510 --> 00:20:54,900 and who do not need to learn that the aguaxima grows in their country. 210 00:20:55,620 --> 00:21:02,069 It is as if you said to a Frenchman that the pear tree is a tree that grows in France and Germany, etc. And it's not meant for us either. 211 00:21:02,070 --> 00:21:09,300 For what do we care that there's a tree in Brazil named aguaxima? If all we know about it is its name, what is the point of giving the name? 212 00:21:10,080 --> 00:21:14,729 It leaves the ignorant just as they were, and teaches the rest of us nothing. If all the same 
213 00:21:14,730 --> 00:21:18,790 I mention this plant here, along with several others that are described just as poorly, then 214 00:21:18,810 --> 00:21:23,070 it is out of consideration for certain readers who prefer to find nothing in a dictionary article, 215 00:21:23,400 --> 00:21:32,660 or even to find something stupid, than to find no article at all. He really, uh, takes some license with this poetics of reuse. 216 00:21:32,670 --> 00:21:40,379 But this really gets to the heart of this notion of compilation, of how you had to construct a dictionary in the 18th century. 217 00:21:40,380 --> 00:21:44,220 And so if we look at the article Compilateur, the compiler, 218 00:21:44,820 --> 00:21:50,490 which is by, sort of ironically, one of the great text reusers of the Encyclopédie, the Abbé Mallet, 219 00:21:51,450 --> 00:21:59,040 uh, he tells us that it's a writer who does not compose anything of genius, but who is content to collect and repeat what others have written. 220 00:21:59,040 --> 00:22:05,010 Very much not originality. However, most lexicographers, he tells us, are merely compilers. 221 00:22:05,940 --> 00:22:07,530 And he goes on to talk about this notion. 222 00:22:07,530 --> 00:22:17,519 But clearly somebody, we would assume Diderot, takes some umbrage with this idea of lexicographers being merely compilers. 223 00:22:17,520 --> 00:22:22,290 And there is another cross-reference at the bottom, probably added by Diderot, to plagiarist. 224 00:22:22,740 --> 00:22:28,740 And so we get finally to plagiarism, and it's defined quite clearly at the beginning as a writer 225 00:22:28,740 --> 00:22:33,840 who plunders other authors and gives their productions as being his own work. 226 00:22:33,840 --> 00:22:39,840 This notion of plundering will come back in one of our titles, which was somewhat badly taken, but it's another story. 227 00:22:41,360 --> 00:22:50,030 And then Diderot goes on and he says. 
But lexicographers, at least those who deal with the arts and sciences, seem to be exempt from the 228 00:22:50,060 --> 00:22:51,440 common laws of mine and yours. 229 00:22:51,680 --> 00:22:58,190 They do not claim to build on their own land, nor to draw from it the materials necessary for the construction of their work. 230 00:22:58,790 --> 00:23:02,989 Indeed, the character of a good dictionary, such as we would like this one to be, 231 00:23:02,990 --> 00:23:06,870 consists largely in making use of the best discoveries of others. 232 00:23:06,890 --> 00:23:16,240 What we borrow from others, we borrow openly, in broad daylight, citing the writers from whom we have drawn. Our status as compilers, 233 00:23:16,250 --> 00:23:23,270 and he repeats the term, gives us a right or title to take advantage of everything that can contribute to the perfection of our purpose, 234 00:23:23,270 --> 00:23:26,450 wherever it may be found. If we steal, 235 00:23:27,690 --> 00:23:31,380 it is only in imitation of the bees, who gather 236 00:23:31,800 --> 00:23:38,000 for the public good. And this notion of the bees that gather goes back to the Renaissance. 237 00:23:38,010 --> 00:23:44,999 Clearly, Montaigne even brings this up, as he's just a bee that's taking nectar from wherever he finds it for the greater 238 00:23:45,000 --> 00:23:45,240 good. 239 00:23:47,850 --> 00:23:56,590 And it cannot exactly be said that we plunder authors, again back to our title, but that we draw contributions from them for the benefit of letters. 240 00:23:56,700 --> 00:24:02,250 So this, then, is how plagiarism is presented in the Encyclopédie: at once 241 00:24:02,700 --> 00:24:10,470 something that's bad, but given the right context, could be potentially quite good and in fact almost a moral duty. 242 00:24:10,920 --> 00:24:18,690 Uh, and we'll see how this evolves. 
Uh, Voltaire takes this definition back up in his great work of, uh, text reuse, 243 00:24:18,690 --> 00:24:21,840 the Questions sur l'Encyclopédie, in 1772. 244 00:24:22,240 --> 00:24:30,330 Uh, and he has a sort of, uh, mercantile definition of plagiarism: when an author sells the thoughts of another for his own, 245 00:24:30,900 --> 00:24:33,809 this petty theft is called plagiarism. We could call plagiarists 246 00:24:33,810 --> 00:24:38,230 all the compilers, all the dictionary makers who do no more than repeat, uh, 247 00:24:38,250 --> 00:24:43,800 opinions, errors, uh, impostures and truths already, uh, printed in previous dictionaries. 248 00:24:44,280 --> 00:24:48,970 Uh, but they are at least plagiarists in good faith. They do not claim credit for invention. 249 00:24:48,990 --> 00:24:55,229 They do not even claim, uh, to have unearthed the materials they have assembled from the ancients. 250 00:24:55,230 --> 00:25:01,050 They have merely copied the laborious compilers of the 16th century. They sell you in quarto what you already have in folio. 251 00:25:01,440 --> 00:25:07,590 Uh, call them booksellers, if you like, and not authors; classify them as second-hand writers rather than plagiarists. 252 00:25:07,650 --> 00:25:13,440 This brings to mind the work of Antoine Compagnon, La Seconde Main, ou le travail de la citation. 253 00:25:14,310 --> 00:25:22,170 Finally, and most interestingly, uh, Marmontel will take up this notion of plagiarism in the Supplément to the Encyclopédie in 1777. 254 00:25:23,160 --> 00:25:27,130 And he, uh, is quite clear that it's no great thing, 255 00:25:27,270 --> 00:25:30,809 it's no great tragedy, to be a plagiarist. 256 00:25:30,810 --> 00:25:32,220 It's a sort of literary crime, 257 00:25:32,370 --> 00:25:40,890 and it's not so serious, for which, uh, pedants, envious people and fools do not fail, uh, to put famous writers on trial. 258 00:25:41,220 --> 00:25:43,620 So the same is true today as it was then.
259 00:25:44,310 --> 00:25:50,370 Plagiarism is the name they give to a theft of thoughts, and they cry out against this theft as if they themselves were being robbed, 260 00:25:50,910 --> 00:25:57,660 or as if it were essential to public order and peace, uh, that the properties of the mind should be inviolable. 261 00:25:57,690 --> 00:26:01,620 Clearly this is before the modern laws on copyright. 262 00:26:01,680 --> 00:26:09,540 Anyone who brings to light, uh, either through expression or appropriateness, a thought that was not his, um, but which would otherwise be lost, 263 00:26:09,570 --> 00:26:14,390 makes it his own by giving it a new being. For oblivion is like nothingness. 264 00:26:14,490 --> 00:26:18,510 So here, plagiarism is a positive thing, and in fact, you have to do it. 265 00:26:19,140 --> 00:26:24,960 You have a duty to posterity to reuse others. In public law, 266 00:26:24,990 --> 00:26:30,600 uh, ownership of land is conditional on its being cultivated, 267 00:26:30,840 --> 00:26:33,560 on its being cultivated by the owner. Uh, uh. 268 00:26:34,650 --> 00:26:39,930 And if the owner let it lie fallow, society would have the right to demand, uh, that he give it up or sell it. 269 00:26:40,380 --> 00:26:47,280 The same is true of literature. Uh, he who has seized a happy and fruitful idea, uh, and does not develop it, 270 00:26:47,280 --> 00:26:54,450 leaves it as a common property for the first occupant who will know better than he does how to develop its richness. 271 00:26:54,870 --> 00:27:01,229 So this is the sort of notion of plagiarism that we've taken on, this notion of text reuse as a productive and, in fact, 272 00:27:01,230 --> 00:27:11,160 almost moral imperative in the 18th century, which clearly will be, uh, destroyed in the laws of 1791 and 1793 on copyright and the right of the author.
273 00:27:11,340 --> 00:27:15,900 But for most of the 18th century, this sort of free and open, uh, 274 00:27:16,170 --> 00:27:22,400 plagiarist text reuse is operative, and nowhere more than in the Encyclopédie. 275 00:27:23,280 --> 00:27:24,810 And, following Diderot, 276 00:27:25,200 --> 00:27:33,059 d'Alembert continues in his Avertissement: it follows from these reflections that the Encyclopédie must often contain, either in extracts or even sometimes in full, 277 00:27:33,060 --> 00:27:40,680 several pieces of the best works of each genre. It is only important to the public that the choice be made with clarity and economy. 278 00:27:41,190 --> 00:27:48,780 Uh, but it is also important for the authors to quote the originals accurately, both to enable the reader to consult them and to give each his due. 279 00:27:49,080 --> 00:27:52,830 This is how several of our colleagues have done it. 280 00:27:53,040 --> 00:27:59,760 We wish that they had all done so. But in any case, uh, when an article is well done, it is equally enjoyable to read it, 281 00:28:00,150 --> 00:28:02,490 uh, from whichever hand it comes. 282 00:28:02,920 --> 00:28:10,350 Uh, and the disadvantage of, uh, failing to quote, which is always great in relation to the author, is much less so in relation to the dictionary. 283 00:28:10,780 --> 00:28:16,050 To bring this back: this sort of provoked us, years ago, to think about how, 284 00:28:16,410 --> 00:28:22,170 how we can find these quotations, uh, which were marked or sometimes not marked, 285 00:28:22,650 --> 00:28:27,180 uh, and these different reuses in the Encyclopédie, and how we could put the Encyclopédie, 286 00:28:27,180 --> 00:28:33,719 which is a huge text, 74,000 articles, 28 volumes in folio, on this similarity spectrum, 287 00:28:33,720 --> 00:28:41,130 to come back to this.
So we did tests in information retrieval and similar-article, uh, comparisons. 288 00:28:41,550 --> 00:28:45,390 And we tried plagiarism detection. It was a bit too strict. 289 00:28:45,390 --> 00:28:53,420 And so this is the sort of preface to how we got to investigating automatic text reuse, and we published two articles. 290 00:28:53,430 --> 00:29:00,030 Again, I mentioned the first, "Plundering Philosophers", and now you see where we get this title from, uh, in 2010, uh, 291 00:29:00,030 --> 00:29:05,100 which experimented with all these different sorts of, uh, information retrieval. 292 00:29:05,550 --> 00:29:08,240 And then there's "Something Borrowed", which is more or less, uh, 293 00:29:08,250 --> 00:29:14,820 outlining how sequence alignment, uh, as an approach to text reuse can be particularly fruitful, 294 00:29:15,330 --> 00:29:24,840 uh, for us. Uh, and these are slides that I have, uh, in true text reuse manner, reused for at least 15 years. 295 00:29:26,340 --> 00:29:33,030 Uh, the investigation of these sorts of relationships begins with the identification of similar passages using sequence alignment, 296 00:29:33,030 --> 00:29:39,750 which is a technique that's used in other domains, uh, to find different strings of similarity. 297 00:29:39,990 --> 00:29:46,410 Uh, so it's used a lot in bioinformatics to think about similar DNA sequences, in plagiarism detection, which I've mentioned, 298 00:29:46,710 --> 00:29:55,200 and then inversely in text collation, so looking for, uh, zones of, uh, variants in textual traditions. 299 00:29:55,560 --> 00:30:03,450 Uh, we were inspired, Mark Olsen in particular, by an article at the time that was inspired by the bioinformatic approach, uh, of sequence alignment. 300 00:30:04,050 --> 00:30:05,430 This is still in use today.
301 00:30:05,790 --> 00:30:12,780 Uh, colleagues from Helsinki who are here use the BLAST algorithm, which comes directly out of, uh, the bioinformatic, uh, world. 302 00:30:13,230 --> 00:30:20,820 Uh, at ARTFL, we've developed, Mark Olsen and now Clovis Gladstone, who is here, our own system called, uh, TextPAIR. 303 00:30:21,300 --> 00:30:24,660 This is as technical as I'll get today. It's not very technical at all. 304 00:30:24,720 --> 00:30:28,830 Uh, but this is to give you an idea of what TextPAIR does, what the system does. 305 00:30:29,190 --> 00:30:36,569 Uh, we take, uh, large databases, we pre-treat them, turn them into what are called n-grams, 306 00:30:36,570 --> 00:30:43,559 so groups of words. Uh, we look for the common sequences of these n-grams, uh, and then we adjust parameters. 307 00:30:43,560 --> 00:30:47,040 There's a high flexibility. Again, plagiarism detection was a bit too strict. 308 00:30:47,040 --> 00:30:53,159 What we're interested in, especially in my new project, is, uh, zones where the same text is there, 309 00:30:53,160 --> 00:30:59,730 but there might be changes or errors or things, you know, that you can jump over. So this idea of, uh, flexible parameters is very important. 310 00:31:00,540 --> 00:31:06,269 And so if you take the first two lines of Rousseau's Social Contract, L'homme est né libre, et partout il est dans les fers, uh, 311 00:31:06,270 --> 00:31:14,610 this is how we pre-treat the, uh, the document. Uh, it becomes homme né libre partout fers. Uh, you can see how this works. 312 00:31:14,760 --> 00:31:21,780 The idea being that, um, libre partout fers, uh, if shared between two documents that are not the same document, is a fairly rare event. 313 00:31:22,080 --> 00:31:26,489 Uh, and given the rarity of this event, the algorithm will then think about, uh, 314 00:31:26,490 --> 00:31:30,690 are there other n-grams shared either to the left or right in a sliding window that we define?
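[Editor's illustration] The pre-treatment and matching steps just described can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not TextPAIR's actual code: the stopword list, the function names (preprocess, ngrams, shared_ngrams, align), the trigram size and the window parameter are all hypothetical choices made for illustration, and the greedy chaining stands in for whatever matching logic the real system uses.

```python
import re

# Illustrative stopword list only (an assumption); a real system would use a fuller one.
STOPWORDS = {"l", "le", "la", "les", "un", "une", "de", "des", "et", "est", "il", "dans"}


def preprocess(text):
    """Lowercase, keep runs of letters, and drop stopwords."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def ngrams(tokens, n=3):
    """Return the overlapping n-grams (as tuples) of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def shared_ngrams(a, b, n=3):
    """Return sorted (i, j) pairs: n-gram i of `a` equals n-gram j of `b`."""
    index = {}
    for i, gram in enumerate(ngrams(a, n)):
        index.setdefault(gram, []).append(i)
    return sorted((i, j) for j, gram in enumerate(ngrams(b, n)) for i in index.get(gram, []))


def align(a, b, n=3, window=5):
    """Greedily chain shared n-grams into passages, tolerating gaps of up to `window` positions."""
    passages = []
    for i, j in shared_ngrams(a, b, n):
        last = passages[-1] if passages else None
        if last and 0 < i - last["a_end"] <= window and 0 < j - last["b_end"] <= window:
            # Close enough to the previous match in both texts: extend the passage.
            last["a_end"], last["b_end"] = i, j
        else:
            passages.append({"a_start": i, "a_end": i, "b_start": j, "b_end": j})
    return passages
```

Calling preprocess on the opening of the Social Contract yields the five tokens homme, né, libre, partout, fers. Widening or narrowing window is one way to picture the "flexible parameters" mentioned above: a larger window lets a passage survive a changed or misread word, while a smaller one behaves more like strict plagiarism detection.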
315 00:31:31,380 --> 00:31:37,080 Uh, so this is a deceptively straightforward, uh, notion of intertextuality. 316 00:31:37,080 --> 00:31:40,200 It's the same text, more or less, in two different places. 317 00:31:40,530 --> 00:31:47,640 Uh, but as you'll see today, across the range of text reuse, uh, the detection of reuses was only the first step. 318 00:31:47,790 --> 00:31:55,320 And in fact, it's in understanding how these reuses are employed and deployed in the 18th century that it becomes really interesting. 319 00:31:55,770 --> 00:32:05,550 So, for example, there's no real surprise that the Encyclopédie cites or uses Bayle, who was a known predecessor of the Encyclopédie. 320 00:32:06,270 --> 00:32:10,230 And we find a very long segment at the end of the article on Spinoza. 321 00:32:10,500 --> 00:32:17,700 But what is surprising is that, at the end of the article, the author finishes in Bayle's voice. 322 00:32:18,330 --> 00:32:21,930 So he concludes the article in a funny budget. 323 00:32:22,230 --> 00:32:25,530 So there's no quotation marks. Bayle is mentioned at the beginning. 324 00:32:25,860 --> 00:32:32,519 Uh, so there's something going on here, but it's a way of sort of putting Bayle in the Encyclopédie. 325 00:32:32,520 --> 00:32:36,720 His voice becomes part of the authorial fingerprint of the Encyclopédie. 326 00:32:37,260 --> 00:32:44,190 So he concludes the article. Whether or not the readers of the time knew this remains to be seen, but for us, 327 00:32:44,190 --> 00:32:51,090 it was a clear indication that we had no idea what was going on here until the computer, uh, found this match. 328 00:32:52,080 --> 00:32:54,809 And we've found this with lots of different works. 329 00:32:54,810 --> 00:32:58,950 So, for example, the idea of using Locke in the Encyclopédie is really, really interesting. 330 00:32:59,250 --> 00:33:06,150 Uh, and we can do this using, uh, text from, from ECCO, uh, which is a translation, obviously.
331 00:33:06,540 --> 00:33:10,770 Uh, and using Locke in the Encyclopédie was a bold move. 332 00:33:10,890 --> 00:33:14,280 Uh, Locke's Second Treatise was, in France, banned and burned. 333 00:33:14,640 --> 00:33:15,600 It was on the Index. 334 00:33:15,840 --> 00:33:23,220 So if you said, I'm going to cite Locke, uh, chances are the censor would come and say, you know, either take it out or you'll lose your privilege. 335 00:33:23,700 --> 00:33:28,799 Uh, but, uh, Jaucourt is a man possessed, and he says, I'm going to use Locke. 336 00:33:28,800 --> 00:33:30,480 I just won't tell anybody what's going on here. 337 00:33:30,480 --> 00:33:38,280 And so all through the article, uh, Government, there is lots and lots of Locke, uh, with no indication whatsoever, uh, that this is where it comes from. 338 00:33:38,310 --> 00:33:46,650 Uh, and again, it's not a copy and paste. It's not, uh, that he's taking a bit of Locke and just pasting it in for an article. 339 00:33:46,650 --> 00:33:50,640 It's that he's weaving it into his own articles to support his own argument. 340 00:33:51,060 --> 00:34:00,960 And it's very much how text reuse worked in, uh, in the 18th century, as a way of generating meaning through the use of other models. 341 00:34:01,380 --> 00:34:05,880 This, uh, caused us to think about the status of different works. What works could you cite in the Encyclopédie? 342 00:34:06,150 --> 00:34:10,410 You could cite the Henriade, for example, which was less controversial. 343 00:34:10,770 --> 00:34:17,850 Uh, but you could never cite, you shouldn't ever cite, the, uh, Lettres philosophiques because, again, like Locke, it was banned and burned. 344 00:34:19,050 --> 00:34:24,840 And then there was a whole sort of spectrum of dangerous works that would make their way into the Encyclopédie.
345 00:34:25,020 --> 00:34:30,149 It was really stupid to think about citing De l'esprit, because De l'esprit, from Helvétius, in fact, 346 00:34:30,150 --> 00:34:35,549 causes the Encyclopédie to lose its privilege, uh, because of its association with materialism. 347 00:34:35,550 --> 00:34:45,540 But once again, Jaucourt, uh, he, uh, is quite insistent, so he uses a bit of, uh, De l'esprit in his article on, uh, the, uh, mine. 348 00:34:45,840 --> 00:34:49,169 Uh, but he gives a tip of the hat here. He says to people, oh, here it is. 349 00:34:49,170 --> 00:34:56,579 There's no quotation marks. Uh, there's just this passage and he says, see, decomposition in the social class. 350 00:34:56,580 --> 00:35:02,100 So it's up to the reader to understand who is this Bush-Cheney? Uh, what's going on? 351 00:35:02,610 --> 00:35:06,120 And would they recognise that this is Helvétius? Not so sure. 352 00:35:06,450 --> 00:35:09,780 Uh, it's true, there's a whole system of coding that goes on in the Encyclopédie. 353 00:35:10,080 --> 00:35:15,150 So at least on philosophies Voltaire, Lupu, Euclid's, Montesquieu. 354 00:35:15,450 --> 00:35:18,389 Uh, there's lots of different other ways of referring to people. 355 00:35:18,390 --> 00:35:24,870 But this is, again, a case where perhaps the readers of the 18th century saw what was going on. 356 00:35:25,500 --> 00:35:35,160 For sure, readers today certainly do not; some specialists maybe. Uh, but it's a real area where, uh, the computer can bring us to new knowledge. 357 00:35:35,520 --> 00:35:38,430 And this was published, of course, with Dan Edelstein and Robert Morrissey. 358 00:35:38,670 --> 00:35:42,330 And then we continued along this vein, thinking about hidden voices in the Encyclopédie, 359 00:35:42,510 --> 00:35:45,720 which is a polyphonic work, as you saw with the sort of incorporation of Bayle.
360 00:35:46,170 --> 00:35:52,380 Uh, maybe we know those 140 authors, but maybe there's many, many, many more that we just don't know about. 361 00:35:52,440 --> 00:35:55,350 Uh, and one of these, of course, was Émilie du Châtelet. 362 00:35:56,130 --> 00:36:03,780 And some people knew that the article was, in fact, drawn more or less from her, uh, Institutions de physique. 363 00:36:04,230 --> 00:36:10,800 And so we digitised, or we took a digitised version of, the Institutions de physique, and we compared it to the Encyclopédie. 364 00:36:11,370 --> 00:36:16,769 And it was really interesting. We found, I think, 13 articles with at least some Émilie du Châtelet. 365 00:36:16,770 --> 00:36:23,940 She is, of course, not an author. There's only one female author in the Encyclopédie, whom Diderot refers to as an anonymous woman. 366 00:36:24,150 --> 00:36:28,470 It's a bit sad. We think it's called wife. That's an article on fashion. 367 00:36:29,010 --> 00:36:35,100 Uh, that's the only woman in the list. But nonetheless, you can see some of these articles, and we've given them scores 368 00:36:35,100 --> 00:36:40,140 for how much Du Châtelet is in there. Contradiction is 95% Du Châtelet. 369 00:36:40,890 --> 00:36:46,140 Uh, and the reason that Formey is mentioned everywhere is that there was this guy named Samuel Formey, 370 00:36:46,650 --> 00:36:51,930 who was the secretary of the Academy of Berlin, and he was going to make his own dictionary. 371 00:36:52,200 --> 00:36:55,259 And then he heard about Diderot and d'Alembert, and he got discouraged. 372 00:36:55,260 --> 00:36:58,470 And so he gives his papers to d'Alembert. He says, do what you want with them. 373 00:36:58,770 --> 00:37:04,009 But Formey just cribs and steals from Émilie du Châtelet in many places, and d'Alembert 374 00:37:04,010 --> 00:37:07,400 doesn't quite know what's going on, and so he puts this into the Encyclopédie.
375 00:37:07,670 --> 00:37:13,310 But in fact, in all of these articles, Continu, for example, d'Alembert says that these come from the papers of Samuel Formey. 376 00:37:13,790 --> 00:37:19,999 But it's Émilie du Châtelet. Uh, so this sort of gives us this notion that, uh, there are these articles where rightly, 377 00:37:20,000 --> 00:37:24,290 somehow, we should mention her today as being at least, 378 00:37:24,710 --> 00:37:34,610 uh, a co-author, uh, with these others. And then, uh, recently, or somewhat recently, uh, we thought about translation as reuse. 379 00:37:34,610 --> 00:37:42,500 So if you know, sort of, the history of the Encyclopédie, it started out as a translation project of the English Cyclopaedia of Ephraim Chambers, 380 00:37:42,680 --> 00:37:50,930 and then quite quickly, uh, it got out of hand and they decided, well, this is going to be much more than just a translation. 381 00:37:51,110 --> 00:37:56,330 But they had this block of text, of translations of articles that were started by different people. 382 00:37:56,720 --> 00:38:00,530 Uh, and we're never quite sure where they occur in the Encyclopédie. 383 00:38:00,740 --> 00:38:09,290 There are articles that have a mention, uh, and here Paolo says that many times they left these articles and just used them as filler, 384 00:38:09,740 --> 00:38:12,740 uh, almost unchanged, without significant modifications. 385 00:38:12,960 --> 00:38:17,600 Uh, but there's never been a sort of systematic undertaking of how to find these articles. 386 00:38:17,810 --> 00:38:22,890 Some are marked with this, uh, reference to Chambers. 387 00:38:22,910 --> 00:38:31,220 There's about 1100 of these. And, uh, this is a good article, uh, a guide to the law by women. 388 00:38:31,520 --> 00:38:40,030 And you can see, uh, what we wanted to do was somehow draw this connection between, uh, the Chambers at the bottom and the, uh, Encyclopédie at the top.
389 00:38:40,040 --> 00:38:46,840 And at the time, I think it's 2018, 2019, it was before these great multilingual language models. 390 00:38:46,850 --> 00:38:53,899 I'm sure today we could do it much more easily. LLMs probably have different ways, but we did it in a pretty hand-cranked way. 391 00:38:53,900 --> 00:38:59,000 As we said, we digitised the 1742 Chambers, which was the Chambers that Diderot owned. 392 00:38:59,480 --> 00:39:08,690 Uh, we ran it through Google Translate automatically, and we kept the structure and we built a virtual dictionary, a sort of fake French Chambers. 393 00:39:09,020 --> 00:39:13,790 And with that French Chambers, we ran the sequence alignment against it to find articles. 394 00:39:13,790 --> 00:39:22,399 So, for example, Gynécocratie comes back, and we had to really, uh, get fine-grained on our comparisons because obviously it's a modern translation. 395 00:39:22,400 --> 00:39:26,660 So for example, it totally has no idea what a petticoat government might or might not be. 396 00:39:26,960 --> 00:39:30,650 It's a uppity motto. It makes no sense whatsoever. 397 00:39:30,650 --> 00:39:34,430 It's a terrible translation, but there's enough, uh, n-grams, 398 00:39:34,430 --> 00:39:37,700 there's enough content in there that we're able to get a match. 399 00:39:38,000 --> 00:39:43,310 Uh, and again, uh, it was already signed as Chambers, uh, and we only found about 800 of the 1100. 400 00:39:43,970 --> 00:39:46,910 So it could be due to translation, could be due to lots of different things. 401 00:39:47,180 --> 00:39:52,520 But we found like 2500 more articles that had no mention of Chambers whatsoever. 402 00:39:52,700 --> 00:39:58,760 So, for example, our friend, the Abbé Mallet, writes the article Démoniaque, which is word for word, 403 00:39:59,060 --> 00:40:03,410 uh, the same number of paragraphs, the same, uh, cross-references as Chambers. 404 00:40:03,620 --> 00:40:07,880 It's taken clearly directly from Chambers.
And he signs his name to it. 405 00:40:08,510 --> 00:40:11,730 So again, is this the Chambers article, or is this the Mallet article? 406 00:40:11,750 --> 00:40:14,959 Uh, moving on. 407 00:40:14,960 --> 00:40:20,180 This is sort of to give you a very synthetic view of how we envision text reuse detection. 408 00:40:21,970 --> 00:40:29,910 We're progressively moving out of the Encyclopédie and into the sort of larger communication world of print culture in the 18th century. 409 00:40:29,920 --> 00:40:32,799 This is Robert Darnton's famous communication network. 410 00:40:32,800 --> 00:40:37,450 And the idea here is that information flows in the 18th century through all sorts of different channels. 411 00:40:38,020 --> 00:40:44,380 Oddly enough, correspondence doesn't show up on his communication network, which I have always found a bit odd. 412 00:40:44,740 --> 00:40:50,290 Uh, but again, this goes back to our time here, uh, my time here in Oxford with Nicholas. One of the, 413 00:40:50,290 --> 00:40:56,260 one of the first ideas we had was that Electronic Enlightenment, correspondence in general, should be really rich for text reuse. 414 00:40:56,270 --> 00:41:04,000 There should be some sort of platform in which text reuse, uh, is filtered through this information space that is at one time, 415 00:41:04,360 --> 00:41:11,079 uh, sociable, but also, uh, a way of diffusing information, a way of building literary authority and otherwise. 416 00:41:11,080 --> 00:41:17,680 So we all know Electronic Enlightenment. And to give you an example of how, uh, this sort of thing can work with Voltaire's correspondence, 417 00:41:17,980 --> 00:41:22,090 we can take an event like the earthquake in Lisbon in 1755. 418 00:41:22,090 --> 00:41:29,950 It happens on November 1st. And we know that Voltaire gets news of this, gets wind of this, and it sort of shakes his faith in, 419 00:41:31,320 --> 00:41:35,810 in, uh, in an optimistic worldview, in divine providence.
420 00:41:36,290 --> 00:41:43,429 And he writes a poem. And we find traces of the poem already in January 1756. 421 00:41:43,430 --> 00:41:51,860 So it's very fast in terms of early modern, and it's just a little bit, and it's, uh, uh, someone that writes to von 422 00:41:51,860 --> 00:41:57,680 Haller, who says, well, I was in his presence and he recited this poem and I have these last eight verses. 423 00:41:58,110 --> 00:42:03,830 Uh, and it's a very funny letter, uh, because he says, he wrote this poem. 424 00:42:04,220 --> 00:42:09,560 I was sent the eight verses which I enclose here. I have reason to believe that it is in his own admission. 425 00:42:09,560 --> 00:42:16,340 It seemed to me these were a little materialistic things, and I've replied so. 426 00:42:16,400 --> 00:42:24,740 But there's no indication here. I've replied with a few fragments of my own translation of von Haller's, uh, work on, uh, the origin of evil. 427 00:42:25,190 --> 00:42:27,829 Uh, so this is really a sort of Frankenstein poem here. 428 00:42:27,830 --> 00:42:37,460 That is eight lines of Voltaire, eight lines of, uh, translation of, uh, von Haller's work, which he does directly into the correspondence. 429 00:42:37,850 --> 00:42:39,739 So this is this poetics of text reuse. 430 00:42:39,740 --> 00:42:46,700 And if we compare the eight lines by Voltaire, we can see this is, uh, we talked about this this afternoon, 431 00:42:46,700 --> 00:42:49,909 this idea that at some point, when you get to the end point of these reuses, 432 00:42:49,910 --> 00:42:53,719 you want to see what the differences are rather than the similarities. 433 00:42:53,720 --> 00:42:57,410 And in this case, the last two lines don't make it into the final cut.
434 00:42:58,220 --> 00:43:08,440 And because we're using modern, uh, modern text from the Voltaire Foundation, uh, we wouldn't get the end of this, this couplet, Que faut-il, 435 00:43:08,450 --> 00:43:16,670 ô mortels? Mortels, il faut souffrir, se soumettre..., where Voltaire thought this was a bit depressing and he doesn't continue on with it. 436 00:43:16,700 --> 00:43:24,650 Uh, but it's a way of recreating this sort of notion of his creating through the correspondence, creating, uh, in different ways. 437 00:43:24,820 --> 00:43:28,940 He's retouching his work, and eventually the sequences get longer as we go on. 438 00:43:28,940 --> 00:43:38,629 And so by March, uh, it's much longer. And it's another letter to von Haller where he says, I was in his presence, and he gave us that, retouched. 439 00:43:38,630 --> 00:43:45,220 So we know it's a retouched piece. And again, here we can show, uh, that more or less it's the same version, right? 440 00:43:45,230 --> 00:43:50,530 There's a few differences. Providence and presence has been changed to prime, uh, live. 441 00:43:50,570 --> 00:43:55,660 But again, this can tell you how the correspondence is really interesting for seeing different states of, 442 00:43:55,750 --> 00:44:01,970 of a work which don't quite make it into editions, or might not make it into editions, but good ones, they work. 443 00:44:03,410 --> 00:44:10,760 Uh, finally. Uh oh. And then we know, as a sort of, uh, media event, somebody sends a letter to Rousseau and says, 444 00:44:10,760 --> 00:44:14,750 what, are you going to let this guy question divine providence? 445 00:44:15,110 --> 00:44:21,259 And, of course, Rousseau was always up to the challenge. And he responds with this really long letter, which is about 6000 words, uh, 446 00:44:21,260 --> 00:44:27,890 which gets printed as on optimism and, and sort of, uh, flies all over Europe.
447 00:44:28,670 --> 00:44:31,520 Voltaire was not very happy about this, that his letter was printed. 448 00:44:31,940 --> 00:44:40,730 Uh, and so we can only assume that one of the responses to this, in the same year that on optimism was, uh, published, is probably Candide. 449 00:44:40,880 --> 00:44:48,800 So, again, this way of thinking of, uh, reuse as a way of, uh, augmenting or understanding, uh, cultural dissemination. 450 00:44:49,610 --> 00:44:56,989 And also, uh, I'd be remiss not to mention just the way that Voltaire reads, and reads with his correspondence, 451 00:44:56,990 --> 00:45:02,240 which is really, he was like a Reader's Digest. Uh, and he loved to excerpt works, and he'd send them to his friends. 452 00:45:02,240 --> 00:45:10,880 So here is another of Helvétius's works. It's published posthumously, 1771, uh, or 1773. 453 00:45:11,000 --> 00:45:17,569 Here Voltaire is taking little snippets out of it, sending them to, uh, d'Alembert and saying, oh, don't you think this is great? 454 00:45:17,570 --> 00:45:25,340 It's too bad. It's actually a pretty bad book. Uh, he does the same thing for Madame du Deffand, uh, which is nice, because she's blind. 455 00:45:25,580 --> 00:45:30,200 Uh, so he's sort of saying, here, I'm giving you all the good parts. 456 00:45:30,560 --> 00:45:33,740 Don't waste your time with the book. It's a bad book. 457 00:45:34,040 --> 00:45:35,240 But I've given you the good things. 458 00:45:35,540 --> 00:45:41,149 And he says, I take these little diamonds at random, and there are thousands of this taste whose brilliance struck me. 459 00:45:41,150 --> 00:45:43,430 That does not prevent the book from being very bad. 460 00:45:43,820 --> 00:45:50,360 I spend my life looking for precious stones in manure, and when I find some, I put them aside and make a profit from them. 461 00:45:50,750 --> 00:45:58,129 That's why bad books are sometimes very useful.
So here, in a nutshell, is this sort of text reuse poetics of the 18th century. 462 00:45:58,130 --> 00:46:06,320 So this is the notion that somehow you accrue through this dissemination, through both the act of extracting, uh, 463 00:46:06,320 --> 00:46:12,770 these precious diamonds from books, but also sending them out. So we can't pick it up unless somebody sends it out. 464 00:46:12,770 --> 00:46:17,210 So text reuse can only help you in this moment that somebody reuses the text again. 465 00:46:17,660 --> 00:46:20,930 Um, but again, it's a whole question. 466 00:46:20,930 --> 00:46:24,379 So we have writers' notebooks, for example. So Voltaire, we know, has his notebooks. 467 00:46:24,380 --> 00:46:29,990 So these are the things that he writes down, these quotations, and what's the moment that 468 00:46:29,990 --> 00:46:38,899 he decides the quotation is going to move out into the world? And so all of this, to move finally to the new, uh, project we have now, which is 469 00:46:38,900 --> 00:46:45,980 to think precisely about, uh, this whole sort of network, or this whole sort of realm or cultural system, 470 00:46:46,400 --> 00:46:51,379 uh, that is the 18th century, based on reuses. So we want to think about intertextual networks. 471 00:46:51,380 --> 00:46:55,730 Don't worry too much about this network. They all sort of look like giant hairballs when you do it. 472 00:46:55,730 --> 00:47:03,410 But Voltaire is always in the middle just because, uh, as I said, he was a real text reuser. 473 00:47:03,680 --> 00:47:07,790 Uh, and this project, uh, called ModERN, that Nicholas mentioned, uh, 474 00:47:07,790 --> 00:47:13,279 is based on this notion that perhaps we can think about the exchange of text reuse. 475 00:47:13,280 --> 00:47:21,470 So not just identifying the reuses, that's the first step, but thinking about how the reuses travel and are used and reused, uh, through networks.
476 00:47:21,680 --> 00:47:29,300 And so, borrowing from the work of Dan Edelstein, who started looking into networks at Stanford with the Mapping the Republic of Letters project, 477 00:47:29,720 --> 00:47:39,860 more recently, we've been thinking about social network analysis as a way of understanding cultural exchange, uh, through different areas. 478 00:47:39,860 --> 00:47:46,100 And here, Ruth and Sebastian Ahnert have been instrumental in promoting this approach in the humanities. 479 00:47:46,580 --> 00:47:51,530 Uh, they have a wonderful new book out, which I recommend to everyone, called Tudor Networks of Power. 480 00:47:52,310 --> 00:48:02,420 There is also, of course, uh, an Oxford Studies collection by Dan Edelstein and, uh, Chloe Edmondson, thinking about how to use social network analysis 481 00:48:03,900 --> 00:48:11,100 for the humanities. And a good primer, again by Ruth and Sebastian, with Catherine Coleman and Scott Weingart, is The Network 482 00:48:11,100 --> 00:48:13,050 Turn. For my own part, 483 00:48:13,350 --> 00:48:23,610 to keep this notion of theorising, uh, in it, I, uh, add a sort of extra layer of what Bruno Latour calls actor-network theory. 484 00:48:23,610 --> 00:48:30,780 And here, in the way we think about this with the networks that we want to build, uh, there are two advantages of actor-network theory. 485 00:48:31,110 --> 00:48:35,489 One is that, uh, there's no presuppositions. 486 00:48:35,490 --> 00:48:41,879 There's no, uh, a priori, uh, you simply generate these networks and you try to trace where the data takes you. 487 00:48:41,880 --> 00:48:45,120 So it's very much in line with what we call data-driven research.
488 00:48:45,360 --> 00:48:51,809 You build the data sets, you run the algorithms over them, and what comes out you then try to make sense of. Uh, 489 00:48:51,810 --> 00:49:00,840 and also within this sort of intellectual framework of, uh, actor-network theory, actors can be anything; they need not be human beings. 490 00:49:01,290 --> 00:49:05,550 Uh, they can be books, they can be texts, they can be snippets. Uh, they can be speed bumps. 491 00:49:05,570 --> 00:49:11,040 As Latour tells us, it's anything that affects the behaviour or the predicted behaviour of a network. 492 00:49:11,430 --> 00:49:17,190 Uh, and another important takeaway from this is that sometimes it's not the originators nor the receivers, 493 00:49:17,460 --> 00:49:22,320 but it's the mediators that are in fact the most important nodes in a network. 494 00:49:23,610 --> 00:49:26,040 Uh, but to do this, of course, we have to have a lot of data. 495 00:49:27,070 --> 00:49:34,209 And this is what we've been doing for the past couple of years, compiling lots and lots of data, as much as we can get in the 18th century realm, 496 00:49:34,210 --> 00:49:36,760 with the only caveat being that we don't digitise anything. 497 00:49:37,300 --> 00:49:45,010 So the idea is that digitised collections are sufficiently significant now that we should use them, 498 00:49:45,010 --> 00:49:50,810 and that although, yes, there are lots of errors, because, yes, they come from OCR, 499 00:49:50,830 --> 00:49:57,430 we should nonetheless use them. So we've taken, uh, a mixture of, uh, corrected works, which we call canon 500 00:49:57,430 --> 00:50:02,290 (so the distinction is only qualitative), and archive, which are OCR. 501 00:50:02,860 --> 00:50:12,040 So we're at about 12,400 books, uh, in this main collection, which we will then compare iteratively through pamphlets, 502 00:50:12,040 --> 00:50:16,359 letters, dictionaries and finally the press, which we're working on with the BnF.
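The remark above that mediators, rather than originators or receivers, can be the most important nodes in a network corresponds to what network analysis usually measures as betweenness centrality. What follows is only a minimal sketch under stated assumptions, not the project's own code: the graph, the node names (source_a, anthology, and so on) and the function name are hypothetical, and the graph is assumed to be directed and unweighted with no duplicate edges.

```python
from collections import deque

def betweenness(graph):
    """Count, for each node, the shortest paths between other nodes that pass through it.

    graph: dict mapping a node to the list of nodes it passes text on to
    (directed, unweighted, no duplicate edges). Hypothetical helper.
    """
    nodes = set(graph) | {w for targets in graph.values() for w in targets}
    score = {n: 0.0 for n in nodes}
    for start in nodes:
        # Breadth-first search from start, counting shortest paths to every node.
        dist = {start: 0}
        paths = {n: 0 for n in nodes}
        paths[start] = 1
        preds = {n: [] for n in nodes}
        order = []
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph.get(v, []):
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, crediting each intermediate node.
        dep = {n: 0.0 for n in nodes}
        for w in reversed(order):
            for v in preds[w]:
                dep[v] += paths[v] / paths[w] * (1 + dep[w])
            if w != start:
                score[w] += dep[w]
    return score

# Hypothetical toy network: two sources are reused by one anthology,
# which is in turn reused by a pamphlet and a newspaper.
toy = {
    "source_a": ["anthology"],
    "source_b": ["anthology"],
    "anthology": ["pamphlet", "newspaper"],
}
scores = betweenness(toy)
print(max(scores, key=scores.get))
```

In this toy case the anthology neither originates nor finally receives anything, yet every path from a source to a later text runs through it, which is the sense of mediator used above.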
503 00:50:16,360 --> 00:50:23,829 Uh, and the notion here is to think about that communication network of Darnton, uh, that it's not just books, uh, 504 00:50:23,830 --> 00:50:29,650 that are important in the 18th century, but how they move through things like pamphlets, letters, and especially the press. 505 00:50:30,070 --> 00:50:35,110 Very interesting to have a think about, uh, just scale. 506 00:50:35,380 --> 00:50:40,990 I think now we have some notion that the press will be about 300 million words. 507 00:50:41,380 --> 00:50:44,800 So we're up to a billion words, which is completely arbitrary. 508 00:50:44,800 --> 00:50:48,670 But it's what I said in my, uh, grant proposal, that we'd get a billion words. 509 00:50:49,180 --> 00:50:56,890 But the idea being that with a billion words, you can build your, uh, your own large language model for the 18th century. 510 00:50:57,220 --> 00:51:01,870 So we can think about either fine-tuning one of them, or maybe we can ask Voltaire some questions, 511 00:51:02,410 --> 00:51:06,250 uh, through the language model that we built. 512 00:51:06,280 --> 00:51:11,079 Nonetheless, uh, so what? Who cares? This is the question I tell my students. 513 00:51:11,080 --> 00:51:15,310 Just put it in. So what? So what? Uh, what can we do then? 514 00:51:15,580 --> 00:51:19,569 We have all this data. We're going to run these alignments. We're going to have to filter them. 515 00:51:19,570 --> 00:51:22,840 But that's a whole, that's a story for another time. 516 00:51:23,140 --> 00:51:26,469 Uh, the first thing we want to do is just simply supplement traditional methods. 517 00:51:26,470 --> 00:51:30,910 It's just new ways of, uh, of doing what historians have always done, 518 00:51:31,750 --> 00:51:38,250 but on steroids and in some sense, in a way that's, uh, both faster and perhaps more comprehensive.
519 00:51:38,260 --> 00:51:43,390 So the ModERN database will contain a great number of these intertextual reuses, 520 00:51:43,810 --> 00:51:50,800 and make it possible to sort of follow the fortunes of intertexts or concepts as they move through this information space. 521 00:51:50,920 --> 00:51:55,959 This is very much the sort of conservative side of intertextuality that comes from Harold Bloom. 522 00:51:55,960 --> 00:52:03,820 So to take one example, we have, uh, Boileau, uh, who writes, uh, "Rien n'est beau que le vrai, le vrai seul est aimable." 523 00:52:04,630 --> 00:52:08,290 But this is very much an aesthetic judgement. It's talking about verisimilitude. 524 00:52:08,680 --> 00:52:20,440 It's about how, uh, a text, a fiction, uh, has to be sort of true for people to embrace it. 525 00:52:20,950 --> 00:52:24,520 Uh, and you can sort of follow it as it moves through the 18th century. 526 00:52:24,530 --> 00:52:28,359 Uh, and so Condillac brings it up, but in speaking about imagination. 527 00:52:28,360 --> 00:52:32,110 So it's still very much the aesthetic, this notion that it's still verisimilitude, 528 00:52:32,110 --> 00:52:36,480 but he's talking about the imagination and how the imagination forms fables and fictions, etc. 529 00:52:36,700 --> 00:52:40,749 Uh, by the time you get to Samuel Johnson, the Rebecca, he repeats it. 530 00:52:40,750 --> 00:52:43,900 He has quotation marks. It's good, but he undermines completely 531 00:52:43,910 --> 00:52:49,170 Boileau, and he says, ran a book on a railway racer, the demon quokka, Cedric Campbell. 532 00:52:49,180 --> 00:52:52,930 Very true. So even though the person that wrote it never wrote a word of truth. 533 00:52:53,440 --> 00:52:59,899 Uh, and then by the time we get to the Revolution, this is an interesting text, which I'll talk about here shortly, called The Impulse Electric. 534 00:52:59,900 --> 00:53:07,380 You can, uh, you can see that.
There's no indication at all that it's Boileau, so no idea that's what the text is there. 535 00:53:07,620 --> 00:53:11,070 But just by its context, you can see that it's changed into a moral category. 536 00:53:11,490 --> 00:53:15,570 Uh, this "true", this is very different from the beginning. 537 00:53:16,020 --> 00:53:20,400 And so by this time it's changed completely in its valence. 538 00:53:21,340 --> 00:53:24,370 So that's just one example of what you could do with our data set. 539 00:53:24,700 --> 00:53:28,840 Secondly, this gets to what Nicholas was mentioning earlier. 540 00:53:29,200 --> 00:53:35,500 In a sense of distant reading, we hope we can really sort of remap the literary history of the 18th century 541 00:53:36,190 --> 00:53:40,239 using this sort of social network analysis and how the reuses move through the different things. 542 00:53:40,240 --> 00:53:48,220 And this sort of presupposes a sort of radical, uh, notion of intertextuality, where everything is an intertext. 543 00:53:48,580 --> 00:53:55,930 And in fact, it's more about this process of dissemination that tells us, uh, not of reproduction, but of productivity. 544 00:53:55,930 --> 00:54:04,780 So an intertext or a cluster of intertexts is a productive force, uh, sometimes without an author, without any reference to intentionality. 545 00:54:04,780 --> 00:54:08,229 It's just these reuses that sort of move through these texts. 546 00:54:08,230 --> 00:54:17,680 And so we have to sort of think about different ways of thinking about authorship, how we understand authors in this, in this text. 547 00:54:17,680 --> 00:54:21,990 And for us, it's just one attribute, uh, of a text object. 548 00:54:22,000 --> 00:54:26,080 So the text is primordial for us, the text that circulates, that moves its way through. 549 00:54:26,590 --> 00:54:32,860 If we know the author, great. If we don't, it's all right. Uh, we hope to find it at some point.
550 00:54:33,460 --> 00:54:40,120 And these are attributes that can be derived from other relationships between text objects, and then an author object, 551 00:54:40,120 --> 00:54:43,620 if we talk about author objects in our networks, uh, 552 00:54:43,630 --> 00:54:48,670 it's the result of the combination of all these different characteristics attributed to it by the relations between the text objects. 553 00:54:48,670 --> 00:54:56,200 So this is a sort of, uh, an author's important, uh, not because of what we read about literary history, 554 00:54:56,200 --> 00:55:01,840 but because of their place in the network and their sort of lines of force as reusers. 555 00:55:03,180 --> 00:55:08,309 This also causes us to think about what the text is. Uh, very prosaically, for us, 556 00:55:08,310 --> 00:55:14,250 the text is the sort of concordance of at least four similar trigrams, uh, that follow within a defined interval. 557 00:55:14,640 --> 00:55:17,880 Uh, but this poses all sorts of questions. And, uh, 558 00:55:19,600 --> 00:55:27,850 all reuses are not equal. So for example, and this comes from Dario Nicolosi, who found this great example, Marat during the Revolution, 559 00:55:28,270 --> 00:55:32,310 uh, reuses, whether consciously, whether on purpose or not, we don't know, 560 00:55:32,320 --> 00:55:36,970 it doesn't really matter, uh, Fénelon. So you would never join these two together. 561 00:55:37,330 --> 00:55:40,900 Uh, he changes it a bit. New becomes Diego. 562 00:55:41,380 --> 00:55:46,320 Uh, but the context is completely different. I mean, it just couldn't be clearer. 563 00:55:46,330 --> 00:55:49,690 And so here we are, like, this is what we want. This is exactly what we want. 564 00:55:50,290 --> 00:55:54,280 But nonetheless, in the bottom here, the titles are the same. 565 00:55:54,790 --> 00:55:58,510 And we'd say this is really interesting, but if you look at the text, it's not interesting at all.
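The working definition given above (at least four similar trigrams that follow within a defined interval) can be illustrated with a small sketch. This is an assumption-laden toy, not the project's alignment pipeline: the function names, the default gap of ten positions, and the use of exact lowercase word trigrams in place of "similar" ones are all choices made here for illustration.

```python
def trigrams(text):
    """Split text on whitespace and return its lowercase word trigrams."""
    words = text.lower().split()
    return [tuple(words[i:i + 3]) for i in range(len(words) - 2)]

def find_reuse(source, target, min_shared=4, max_gap=10):
    """Return (first, last) trigram positions in target for each candidate reuse.

    A candidate is a run of at least min_shared trigrams that also occur in
    source, with no more than max_gap positions between consecutive matches.
    Both thresholds are hypothetical parameters.
    """
    source_set = set(trigrams(source))
    hits = [i for i, tg in enumerate(trigrams(target)) if tg in source_set]
    passages, run = [], []
    for pos in hits:
        # Close the current run when the next match falls outside the interval.
        if run and pos - run[-1] > max_gap:
            if len(run) >= min_shared:
                passages.append((run[0], run[-1]))
            run = []
        run.append(pos)
    if len(run) >= min_shared:
        passages.append((run[0], run[-1]))
    return passages

# Invented example sentences, not taken from the corpus.
source = "nothing is beautiful but the true and the true alone is lovely"
target = "he said that nothing is beautiful but the true and then moved on"
print(find_reuse(source, target))
```

A matcher of this kind only flags candidates. As the Marat example and the matching titles show, deciding whether a flagged passage is an interesting reuse still takes reading.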
566 00:55:59,380 --> 00:56:04,850 It's reuse. It's an intertext. You can't say it has an author. 567 00:56:05,330 --> 00:56:09,590 It circulates. It has some value, but we're not quite sure. 568 00:56:10,040 --> 00:56:12,200 Uh, so this sort of gets us into this notion. 569 00:56:12,620 --> 00:56:21,620 And this is exactly why, uh, we've been trotting out this notion of Barthes for 15 years, but we never quite got to the end of the quotation. 570 00:56:22,460 --> 00:56:26,690 And then he says, in fact, intertextuality is a condition of any text whatsoever. 571 00:56:27,050 --> 00:56:29,810 It's obviously not reducible to a problem of sources and influences, 572 00:56:29,810 --> 00:56:34,879 which is what we've always thought. The intertext is a general field of anonymous formulas, 573 00:56:34,880 --> 00:56:41,420 the origin of which is rarely identifiable, of unconscious or automatic quotations given without inverted commas. 574 00:56:41,450 --> 00:56:45,710 So how do you find that? How do you understand that? What does that mean? 575 00:56:46,220 --> 00:56:52,760 Uh, and how can we, uh, implement that into our networks? And then finally, to come back to Genette, 576 00:56:53,240 --> 00:56:56,690 he says, in fact, some texts are really, really hypertextual. 577 00:56:56,690 --> 00:57:00,980 And this is before Tim Berners-Lee. So it's a theoretical hypertext. 578 00:57:01,400 --> 00:57:05,450 Uh, there's no literary work that, to some degree and depending on the reading, does not evoke another. 579 00:57:05,840 --> 00:57:08,240 Uh, and in this sense, all works are hypertextual. 580 00:57:08,630 --> 00:57:14,690 But like Orwell's animals in Animal Farm, some are more hypertextual, more obviously, massively and explicitly, than others. 581 00:57:15,050 --> 00:57:20,660 The Encyclopédie is one of these. Uh, Voltaire's Questions sur l'Encyclopédie is one of these hypertextual works.
582 00:57:21,050 --> 00:57:25,550 Uh, and we find them again and again in popping up the Republican. 583 00:57:25,730 --> 00:57:32,360 This is a, uh, a beginning, a sort of very preliminary, uh, look at network, uh, measures. 584 00:57:32,660 --> 00:57:36,620 It kept popping up. We didn't even know what it was. It was just in our it was in our corpora. 585 00:57:36,950 --> 00:57:42,410 Uh, and then we looked at it and we said, oh, this makes perfect sense why it would be there and why would have these high scores? 586 00:57:42,740 --> 00:57:51,469 Uh, because it it's a text reuse machine. Uh, and he writes it in at the very beginning of, of the revolution, and he says, 587 00:57:51,470 --> 00:57:54,740 I'm offering it to these students, all these little maxims, all these little princes. 588 00:57:54,740 --> 00:57:58,760 And you have one a day, uh, that you're meant to read and think about and ruminate on. 589 00:57:59,390 --> 00:58:05,600 And it's for the entire Republican calendar. Uh, and so again, we thought, well, this is really interesting. 590 00:58:05,600 --> 00:58:09,499 We should be able to find more than we thought. But in fact, it was a lot of work. 591 00:58:09,500 --> 00:58:13,309 And Dario had to do a lot of hand checking for these different things. 592 00:58:13,310 --> 00:58:15,530 But we can find the sort of modern sources. 593 00:58:15,530 --> 00:58:20,690 There's a lot of Rousseau, there's some Benjamin Franklin, there's some Claude, there's a history of China. 594 00:58:20,900 --> 00:58:25,070 Uh, so it's very syncretic and it's sort of, uh, where these come from. 595 00:58:25,160 --> 00:58:33,020 But there's a whole mass of maxims, uh, of reasons that we, we can't identify and, and that remain unidentifiable. 596 00:58:33,020 --> 00:58:34,400 And so if you just take one day. 597 00:58:35,150 --> 00:58:43,010 So this is the first decade of, uh, one decade of pretty major, uh, you can see the, the different sort of pathways that go through here. 
598 00:58:43,010 --> 00:58:44,480 We can't some we haven't found. 599 00:58:44,870 --> 00:58:53,360 It goes from Benjamin Franklin to Voltaire to lawyer, uh, back to Franklin's, uh, Sinek, uh, Benjamin Franklin Duplo, etc. 600 00:58:53,780 --> 00:59:03,139 Uh, so it's a really fascinating work. Uh, it causes you to go in and really get into the text which Dario is proclaimed, as do close reading. 601 00:59:03,140 --> 00:59:08,990 So that's sort of a lovely pun on close reading. Uh, but it's interesting to see how these transformations happen. 602 00:59:09,020 --> 00:59:13,400 Again, there's no indication of the author in this text. There's just the reuse. 603 00:59:13,910 --> 00:59:17,870 Uh, but you can see that there are certain changes that happen for whatever reason in this case, 604 00:59:17,870 --> 00:59:25,489 in the first case with this history of Louis the 12th and Adam is changed into in the Atlantic, a comma changes completely. 605 00:59:25,490 --> 00:59:32,000 The text, it's a different text. So it's it's still the clue is it is a different who does it belong to? 606 00:59:32,570 --> 00:59:35,840 What does it mean? Uh, secondly, Virginia Seaborne. 607 00:59:35,840 --> 00:59:39,680 So this goes back to this notion of originality, less privacy by less pressing. 608 00:59:39,680 --> 00:59:44,690 What is it? So it's more about rationalism. Uh, less than originality. 609 00:59:44,690 --> 00:59:54,380 So this is this sort of hyper textual, um, text, and it gets really wild if you, if you, if you move it out here we have two lines from lucid. 610 00:59:54,920 --> 00:59:58,129 Uh, they appear to be near patterns. Increase their sources. 611 00:59:58,130 --> 01:00:01,280 Seymour, tell who they are and then mark two lines from the. I had you. 612 01:00:01,280 --> 01:00:11,750 Come on. Uh, and also with a little note about you, I had the book from Butler as a feature in, uh, the sixth decade of, uh, number 30. 
613 01:00:11,930 --> 01:00:18,410 You have both these lines that have been melded together. Multiple identities, two sources similar to point, but double. 614 01:00:18,770 --> 01:00:24,200 So, Paul, Paul Butler, uh, shows up out of nowhere, just thrown in there. 615 01:00:24,260 --> 01:00:27,860 John. Don't go I don't I won't do it in the future. 616 01:00:28,310 --> 01:00:32,210 Uh, so who does this belong to? Is this Corneille? Is this Voltaire? 617 01:00:32,510 --> 01:00:36,080 What is Ballard doing there? Uh, what's going on? 618 01:00:36,410 --> 01:00:40,400 Uh, so again, the computer takes you only so far. 619 01:00:40,700 --> 01:00:43,940 Uh, and then you have to just sort of think about what's going on. 620 01:00:44,000 --> 01:00:48,020 And then finally another text that's popped up, which is completely crazy, 621 01:00:48,410 --> 01:00:53,750 and I'll end on this, is this text by Jean-Marie Chassaignon, who was completely unknown. 622 01:00:53,750 --> 01:00:56,239 It's too bad, uh, Catriona Seth isn't here today. 623 01:00:56,240 --> 01:01:02,360 She might be one of the few people, with Michel Delon, who actually knows who this person is, because he showed up in her anthology of 624 01:01:02,740 --> 01:01:06,280 French poetry from 2008. And he's this sort of illuminé. 625 01:01:06,550 --> 01:01:12,700 He's one of the illuminists, à la Mesmer, comes from Lyon, and he writes this insane book called The Cataracts of the Imagination, 626 01:01:13,030 --> 01:01:17,770 Flood of Scribomania, Literary Vomit, Encyclopaedic Haemorrhage, 627 01:01:18,040 --> 01:01:25,230 Monster of Monsters, by Epimenides the Inspired, published in the Cave of Trophonius, in the Land of Visions, 628 01:01:25,270 --> 01:01:28,629 so I don't know how you would critically edit that text. 629 01:01:28,630 --> 01:01:31,950 Where do you put it? Where do you map that?
630 01:01:31,960 --> 01:01:35,230 The vision really gets a little more specific, and it's an insane book, 631 01:01:35,230 --> 01:01:42,520 but it's really interesting because I think it's the culmination of this really open poetics of reuse for the 18th century. 632 01:01:42,520 --> 01:01:46,749 And he has this preface position. So he says, I'm not going to write a preface. 633 01:01:46,750 --> 01:01:52,800 And then he writes 100 pages of preface where he says, I'm just weaving together here prose and verse. 634 01:01:52,810 --> 01:01:56,440 I call on all the annals to my aid. I call on all the authors and review them 635 01:01:56,440 --> 01:02:02,770 as the new dictator of the literary republic. I indulge in the factions who make this names, etc. So you get sort of where he's going. 636 01:02:02,980 --> 01:02:04,360 My pamphlet, fleshed out 637 01:02:04,570 --> 01:02:11,740 (it's four volumes, by the way, not quite a pamphlet), fleshed out in this way with historical traits, swollen with the simulations of the mind, 638 01:02:11,760 --> 01:02:17,530 studded with quotations of all kinds, becomes a literary mosaic. And this goes back to what I said at the beginning, and you can see it. 639 01:02:17,830 --> 01:02:20,709 He takes great inspiration from Montaigne, because he said, 640 01:02:20,710 --> 01:02:24,640 Montaigne is great because he would put a chapter title and then [INAUDIBLE] give you a chapter 641 01:02:24,640 --> 01:02:27,130 that has nothing to do with the title. And he thinks that's amazing. 642 01:02:27,520 --> 01:02:34,389 Uh, and then so you can see the sort of inspiration from Bayle in all of his notes, notes to notes. 643 01:02:34,390 --> 01:02:40,090 So he has one for one sentence here. Revolt. The uncle used to keep too young. 644 01:02:40,240 --> 01:02:43,900 Some more. There's an asterisk note. 645 01:02:44,470 --> 01:02:48,820 There's another note that goes on for two more pages, just commenting on himself.
646 01:02:48,820 --> 01:02:53,800 And he's both. He's a that's a fantastic work. Sometimes he cites where it comes from. 647 01:02:54,250 --> 01:02:58,180 Other times there's nothing. Uh, so again, it's this machine. 648 01:02:58,450 --> 01:03:00,550 It's a, it's a sort of text reuse machine. 649 01:03:00,970 --> 01:03:08,350 Um, I won't read all of his quote, but he gets really into this notion of how he's building, uh, with all the different, 650 01:03:08,740 --> 01:03:14,889 uh, all these different types of texts, what he calls, uh, we have to build a type of encyclopaedia. 651 01:03:14,890 --> 01:03:19,180 So he's taking up this notion of the on site as we come back to it, uh, 652 01:03:19,270 --> 01:03:24,489 we have to use this idea of universality, uh, with a multiplicity of their talents as the poet, 653 01:03:24,490 --> 01:03:28,540 to throw into the same more of the sort of the epigram, the roses, the vegetation, the nettles, 654 01:03:28,540 --> 01:03:34,690 etc. every genre, every author, the entire 18th century should come to bear on this text. 655 01:03:35,080 --> 01:03:37,239 And he continues on this bizarre, overflowing. 656 01:03:37,240 --> 01:03:43,930 So this sort of expansion of wandering thoughts and hidden passages does not require a very strange effort of the imagination. 657 01:03:45,350 --> 01:03:53,480 Uh, imaginative, which I wasn't quite sure what the word is in French, with dictionaries, collections, patient copyists and time. 658 01:03:53,750 --> 01:04:00,979 The most limited scribe is capable of populating the literary universe with encyclopaedic monsters, so to echo hard times, 659 01:04:00,980 --> 01:04:08,660 with the right tools, the right large language model, the right resources, we too could build these sorts of poetic model monsters. 660 01:04:09,050 --> 01:04:14,030 And perhaps that's precisely what language models like GPT are encyclopaedic monsters. 
661 01:04:14,910 --> 01:04:21,270 Diderot, of course, already foresaw this monstrosity in the making of his own Encyclopédie when he 662 01:04:21,270 --> 01:04:26,580 took stock of their universe, their sort of universal text, in 1755: 663 01:04:27,030 --> 01:04:32,430 far from the perfect system of human understanding that was proposed by d'Alembert in the preface, 664 01:04:32,970 --> 01:04:39,060 in the Discours préliminaire, what they had was uneven, unwieldy, and ultimately uncontrollable. 665 01:04:40,110 --> 01:04:45,660 The proof can be found in 100 places in this work. Here we are bloated and exorbitantly fat. 666 01:04:45,900 --> 01:04:48,900 There we are, lean, petty, pathetic, dry and scrawny. 667 01:04:49,230 --> 01:04:53,010 In one place we are skeletal and in another we rather seem dropsical. 668 01:04:53,460 --> 01:05:00,930 We are by turns dwarves and giants, colossi and pygmies, standing straight, nicely built and well-proportioned, hunchbacked, lame and deformed. 669 01:05:01,680 --> 01:05:08,579 Add to all these quirks that of a text which is sometimes abstract, recondite and mannered, more often careless, 670 01:05:08,580 --> 01:05:16,110 drawn out and diffuse, and you will compare the work to a monster of the poetic art, or even to something more hideous. 671 01:05:16,110 --> 01:05:22,500 Yet, as the predecessor to Wikipedia and perhaps even the internet, or at the very least the World Wide Web, 672 01:05:23,880 --> 01:05:32,460 and in direct lineage with the other imaginary machines of Charles Babbage and Ada Lovelace, of Alan Turing, of Vannevar Bush, 673 01:05:32,790 --> 01:05:36,809 uh, the poetics of reuse enacted by the Encyclopédie, and indeed, 674 01:05:36,810 --> 01:05:42,300 as we've seen, over the entirety of the 18th century, is a relatively modern, if not to say contemporary, phenomenon.
675 01:05:42,870 --> 01:05:49,170 And we who live now amongst the encyclopaedic monsters of AI would then do well not to forget our history. 676 01:05:50,350 --> 01:05:53,960 To return very briefly to our illuminated poet: 677 01:05:53,980 --> 01:05:58,210 I tried to find more information in the 19th-century biographies of the great French literature. 678 01:05:58,870 --> 01:06:02,050 It just says when he was born and died, and it gives a list of his works. 679 01:06:02,680 --> 01:06:08,290 But at the very end there's a very curious note where it says, M. Chassaignon left behind 680 01:06:08,410 --> 01:06:13,480 many a manuscript which his brother, a grocer in Lyon, used to wrap the medicines in his shop. 681 01:06:14,500 --> 01:06:21,670 Clearly, this is yet another example, perhaps not so tractable in computational terms, of the poetics of text reuse. 682 01:06:22,150 --> 01:06:22,930 Thank you very much.