Notes

1 Chatbot has now become the most common term for conversational agents, but “chatterbot,” a term coined by famous bot builder Michael Mauldin in the 1990s, was then a common term for the same phenomenon (Deryugina, 2010; Leonard, 1997, p. 4).

2 ELIZA’s pattern matching and substitution methods work in the same way that “regular expressions” or “RegEx” work today. RegEx is a formal language for specifying text-search patterns. It is often used to help computers search for and detect pre-defined patterns in human language (words, sentences, phone numbers, URLs, etc.) as a pre-processing step for modern natural language processing programs (Jurafsky & Martin, 2018, pp. 9–10).
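For readers who want a concrete picture, here is a minimal sketch of regular-expression pattern matching and substitution in the spirit of this note, written with Python’s re module; the rule, the reflected response, and the sample sentence are invented for illustration and are not taken from ELIZA’s actual script.

```python
import re

# Illustrative ELIZA-style rule: capture whatever follows "I feel"
# and reflect it back as a question. The pattern and wording are
# invented for this sketch; ELIZA's real rules were more elaborate.
RULE = re.compile(r"\bI feel (?P<feeling>.+?)[.!?]?$", re.IGNORECASE)

def respond(utterance: str) -> str:
    """Return a reflected question if the rule matches, else a fallback."""
    match = RULE.search(utterance)
    if match:
        return f"Why do you feel {match.group('feeling')}?"
    return "Please tell me more."

print(respond("I feel anxious about bots."))
# -> Why do you feel anxious about bots?
```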

3 The fundamental importance of bots as a sense-making and infrastructural part of the internet is one of the primary reasons why laws or regulations advocating a blanket ban on bots would destroy the modern web as we know it.

4 Of course, modern computer networks and the internet use dozens of protocols in addition to HTTP, such as the transmission control protocol (TCP), internet protocol (IP), and simple mail transfer protocol (SMTP) – all used millions of times every day. All of these small parts are necessary cogs that make up the machinery of the modern internet (Frystyk, 1994; Shuler, 2002).

5 The first graphical MUDs did not begin to appear until the mid-1980s (Castronova, 2005, p. 55).

6 Early web indexing bots were also called by other names, including “wanderers,” “worms,” “fish,” “walkers,” “knowbots,” and “web robots,” among others (Gudivada et al., 1997).

7 While HTML is the primary language that web developers use to build webpages, other languages, such as CSS and JavaScript, provide important secondary functions and are essential building blocks for modern websites.

8 Today, the ubiquity of web-indexing crawler bots on the World Wide Web is one aspect of what distinguishes the everyday internet from what is known as the “dark web.” The dark web is an alternative form of the internet which requires additional software, protocols, and technical knowledge to access – Tor, I2P, Freenet, ZeroNet, and GNUnet are but a few of the tools that can be used to reach it (Gehl, 2018). While security enthusiasts, researchers, and cybersecurity firms can build crawler bots to explore the dark web, large-scale centralized search engines like Google do not exist there. Navigation of the dark web is therefore mainly conducted through word of mouth within small communities, or on smaller-scale search engines that resemble the “web directories” of the early internet. Much of the dark web is made up of sites that are not indexed by crawler bots at all. Much of the activity that takes place on the dark web is meant to be clandestine (such as online crime, illegal marketplaces, and censorship circumvention websites and tools). The dark web also allows users and publishers to remain anonymous online.

9 Martijn Koster, one of the most prominent bot developers and bot thinkers in the 1990s, also built a database of known crawlers (or “Web Robots,” in the parlance of the times) that is still online (Koster, n.d.).

10 The “deep web” is also a concept worth noting. Deep web sites are sites that require special permissions (such as a password) to access and cannot be read or seen without that access. For example, while facebook.com itself is indexable and readable by all search engines, particular Facebook users’ profiles and posts may not be visible in search engine results due to individual privacy settings. So, while facebook.com itself is part of the clear web, unviewable and unindexed profiles would qualify as being part of the deep web.

11 The Robot Exclusion Standard is also known as the Robot Exclusion Protocol.
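As a concrete illustration of the standard, the following sketch uses Python’s built-in urllib.robotparser to check hypothetical robots.txt rules before fetching a page; the rules, URLs, and crawler name are placeholders invented for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules of the kind a site operator publishes at /robots.txt
# under the Robot Exclusion Standard.
RULES = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# A well-behaved crawler consults the rules before requesting a URL.
print(parser.can_fetch("ExampleCrawler", "https://example.com/private/report.html"))  # False
print(parser.can_fetch("ExampleCrawler", "https://example.com/index.html"))           # True
```

In practice a crawler would point the parser at a site’s live robots.txt with set_url() and read() rather than parsing a hard-coded string.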

12 In addition to APIs and headless browser automation tools, there is a third option for programming online bots: Graphical User Interface (GUI) automation packages like Python’s pyautogui library. These packages enable users to program automated behavior on a local computer, automating things like cursor movement and keyboard input (Kiehl, 2012; Sweigart, 2015, pp. 438–439). However, like using automated browser software, this is a more time-consuming and difficult way to build a bot, and it generally requires developers with significant technical skill.
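A minimal sketch of what such GUI automation looks like with pyautogui follows; the screen coordinates and typed text are arbitrary placeholders, and the script assumes a machine with a graphical display and the pyautogui package installed.

```python
import pyautogui

# Fail-safe: slamming the cursor into the top-left screen corner aborts the script.
pyautogui.FAILSAFE = True

# Arbitrary placeholder actions: glide the cursor, click, then type some text.
pyautogui.moveTo(200, 300, duration=0.5)                # move to x=200, y=300 over half a second
pyautogui.click()                                       # click at the current cursor position
pyautogui.typewrite("hello from a bot", interval=0.05)  # press keys one at a time
```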

13 As Gorwa and Guilbeault note, researchers sometimes choose to write social bot as one word, socialbot (2018).

14 The term propaganda itself is similar in this way – while the English term carries an exclusively negative connotation, in other languages such as Chinese, the term for “propaganda” (宣傳) is more neutral in tone, with a sense of general promotion or public relations of any sort (Jack, 2017).
