Cache language model

A cache language model is a type of statistical language model. These occur in the natural language processing subfield of computer science and assign probabilities to given sequences of words by means of a probability distribution. Statistical language models are key components of speech recognition systems and of many machine translation systems: they tell such systems which possible output word sequences are probable and which are improbable. The particular characteristic of a cache language model is that it contains a cache component and assigns relatively high probabilities to words or word sequences that occur elsewhere in a given text. The primary, but by no means sole, use of cache language models is in speech recognition systems. To understand why it is a good idea for a statistical language model to contain a cache component one might consider someone who is dictating a letter about elephants to a speech recognition system. Standard (non-cache) N-gram language models will assign a very low probability to the word "elephant" because it is a very rare word in English. If the speech recognition system does not contain a cache component, the person dictating the letter may be annoyed: each time the word "elephant" is spoken another sequence of words with a higher probability according to the N-gram language model may be recognized (e.g., "tell a plan"). These erroneous sequences will have to be deleted manually and replaced in the text by "elephant" each time "elephant" is spoken. If the system has a cache language model, "elephant" will still probably be misrecognized the first time it is spoken and will have to be entered into the text manually; however, from this point on the system is aware that "elephant" is likely to occur again – the estimated probability of occurrence of "elephant" has been increased, making it more likely that if it is spoken it will be recognized correctly. 
Once "elephant" has occurred several times, the system is likely to recognize it correctly every time it is spoken until the letter has been completely dictated. This increase in the probability assigned to the occurrence of "elephant" is an example of a consequence of machine learning and more specifically of pattern recognition. There exist variants of the cache language model in which not only single words but also multi-word sequences that have occurred previously are assigned higher probabilities (e.g., if "San Francisco" occurred near the beginning of the text subsequent instances of it would be assigned a higher probability). The cache language model was first proposed in a paper published in 1990, after which the IBM speech-recognition group experimented with the concept. The group found that implementation of a form of cache language model yielded a 24% drop in word-error rates once the first few hundred words of a document had been dictated. A detailed survey of language modeling techniques concluded that the cache language model was one of the few new language modeling techniques that yielded improvements over the standard N-gram approach: "Our caching results show that caching is by far the most useful technique for perplexity reduction at small and medium training data sizes". The development of the cache language model has generated considerable interest among those concerned with computational linguistics in general and statistical natural language processing in particular: recently, there has been interest in applying the cache language model in the field of statistical machine translation. The success of the cache language model in improving word prediction rests on the human tendency to use words in a "bursty" fashion: when one is discussing a certain topic in a certain context, the frequency with which one uses certain words will be quite different from their frequencies when one is discussing other topics in other contexts. 
The traditional N-gram language models, which rely entirely on information from a very small number (four, three, or two) of words preceding the word to which a probability is to be assigned, do not adequately model this "burstiness". Recently, the cache language model concept – originally conceived for the N-gram statistical language model paradigm – has been adapted for use in the neural paradigm. For instance, recent work on continuous cache language models in the recurrent neural network (RNN) setting has applied the cache concept to much larger contexts than before, yielding significant reductions in perplexity. Another recent line of research involves incorporating a cache component in a feed-forward neural language model (FN-LM) to achieve rapid domain adaptation.
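The elephant example can be made concrete with a minimal sketch. It assumes the simplest possible setup: a fixed unigram base model linearly interpolated with a cache of recently seen words. The class name `CacheUnigramLM`, the cache size, the interpolation weight, and the toy probabilities are all illustrative assumptions rather than values from the literature, and real systems typically combine the cache with a full N-gram or neural model.

```python
from collections import Counter, deque


class CacheUnigramLM:
    """Toy cache language model: base unigram model mixed with a word cache."""

    def __init__(self, base_probs, cache_size=200, cache_weight=0.2, floor=1e-6):
        self.base_probs = base_probs        # word -> static probability (assumed given)
        self.cache_weight = cache_weight    # hypothetical interpolation weight
        self.floor = floor                  # base probability for unknown words
        self.window = deque(maxlen=cache_size)
        self.counts = Counter()

    def observe(self, word):
        """Record a word; the oldest word drops out once the cache is full."""
        if len(self.window) == self.window.maxlen:
            self.counts[self.window[0]] -= 1
        self.window.append(word)
        self.counts[word] += 1

    def prob(self, word):
        """Interpolate the static probability with the cache frequency."""
        base = self.base_probs.get(word, self.floor)
        if not self.window:
            return base
        cache = self.counts[word] / len(self.window)
        return (1 - self.cache_weight) * base + self.cache_weight * cache


# Toy base model in which "elephant" is very rare.
lm = CacheUnigramLM({"the": 0.05, "elephant": 0.00001})
before = lm.prob("elephant")
for w in "the elephant saw the other elephant".split():
    lm.observe(w)
after = lm.prob("elephant")
print(before, after)
```

After the word has appeared twice in a six-word history, its estimated probability is dominated by the cache term, which is the "bursty" behaviour described above.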

Eigenface

An eigenface (EYE-gən-) is the name given to a set of eigenvectors when used in the computer vision problem of human face recognition. The approach of using eigenfaces for recognition was developed by Sirovich and Kirby and used by Matthew Turk and Alex Pentland in face classification. The eigenvectors are derived from the covariance matrix of the probability distribution over the high-dimensional vector space of face images. The eigenfaces themselves form a basis set of all images used to construct the covariance matrix. This produces dimension reduction by allowing the smaller set of basis images to represent the original training images. Classification can be achieved by comparing how faces are represented by the basis set.

== History ==

The eigenface approach began with a search for a low-dimensional representation of face images. Sirovich and Kirby showed that principal component analysis could be used on a collection of face images to form a set of basis features. These basis images, known as eigenpictures, could be linearly combined to reconstruct images in the original training set. If the training set consists of M images, principal component analysis could form a basis set of N images, where N < M. The reconstruction error is reduced by increasing the number of eigenpictures; however, the number needed is always chosen to be less than M. For example, if N eigenfaces are generated for a training set of M face images, each face image can be described as being made up of "proportions" of all N "features" or eigenfaces: Face image 1 = (23% of E_1) + (2% of E_2) + (51% of E_3) + ... + (1% of E_N). In 1991 M. Turk and A. Pentland expanded these results and presented the eigenface method of face recognition.
In addition to designing a system for automated face recognition using eigenfaces, they showed a way of calculating the eigenvectors of a covariance matrix such that computers of the time could perform eigen-decomposition on a large number of face images. Face images usually occupy a high-dimensional space and conventional principal component analysis was intractable on such data sets. Turk and Pentland's paper demonstrated ways to extract the eigenvectors based on matrices sized by the number of images rather than the number of pixels. Once established, the eigenface method was expanded to include methods of preprocessing to improve accuracy. Multiple manifold approaches were also used to build sets of eigenfaces for different subjects and different features, such as the eyes.

== Generation ==

A set of eigenfaces can be generated by performing a mathematical process called principal component analysis (PCA) on a large set of images depicting different human faces. Informally, eigenfaces can be considered a set of "standardized face ingredients", derived from statistical analysis of many pictures of faces. Any human face can be considered to be a combination of these standard faces. For example, one's face might be composed of the average face plus 10% from eigenface 1, 55% from eigenface 2, and even −3% from eigenface 3. Remarkably, it does not take many eigenfaces combined together to achieve a fair approximation of most faces. Also, because a person's face is not recorded by a digital photograph, but instead as just a list of values (one value for each eigenface in the database used), much less space is taken for each person's face.

The eigenfaces that are created will appear as light and dark areas that are arranged in a specific pattern. This pattern is how different features of a face are singled out to be evaluated and scored.
There will be a pattern to evaluate symmetry, whether there is any style of facial hair, where the hairline is, or an evaluation of the size of the nose or mouth. Other eigenfaces have patterns that are less simple to identify, and the image of the eigenface may look very little like a face.

The technique used in creating eigenfaces and using them for recognition is also used outside of face recognition: handwriting recognition, lip reading, voice recognition, sign language/hand gestures interpretation and medical imaging analysis. Therefore, some do not use the term eigenface, but prefer to use 'eigenimage'.

=== Practical implementation ===

To create a set of eigenfaces, one must:

1. Prepare a training set of face images. The pictures constituting the training set should have been taken under the same lighting conditions, and must be normalized to have the eyes and mouths aligned across all images. They must also be all resampled to a common pixel resolution (r × c). Each image is treated as one vector, simply by concatenating the rows of pixels in the original image, resulting in a single column with r × c elements. For this implementation, it is assumed that all images of the training set are stored in a single matrix T, where each column of the matrix is an image.

2. Subtract the mean. The average image a has to be calculated and then subtracted from each original image in T.

3. Calculate the eigenvectors and eigenvalues of the covariance matrix S. Each eigenvector has the same dimensionality (number of components) as the original images, and thus can itself be seen as an image. The eigenvectors of this covariance matrix are therefore called eigenfaces. They are the directions in which the images differ from the mean image. Usually this will be a computationally expensive step (if at all possible), but the practical applicability of eigenfaces stems from the possibility to compute the eigenvectors of S efficiently, without ever computing S explicitly, as detailed below.

4. Choose the principal components. Sort the eigenvalues in descending order and arrange eigenvectors accordingly. The number of principal components k is determined arbitrarily by setting a threshold $\epsilon$ on the total variance. The total variance is $v = \lambda_1 + \lambda_2 + \dots + \lambda_n$, where n is the number of components and $\lambda_i$ is the eigenvalue of the i-th component. k is the smallest number that satisfies

$$\frac{\lambda_1 + \lambda_2 + \dots + \lambda_k}{v} > \epsilon$$

These eigenfaces can now be used to represent both existing and new faces: we can project a new (mean-subtracted) image on the eigenfaces and thereby record how that new face differs from the mean face. The eigenvalues associated with each eigenface represent how much the images in the training set vary from the mean image in that direction. Information is lost by projecting the image on a subset of the eigenvectors, but losses are minimized by keeping those eigenfaces with the largest eigenvalues. For instance, working with a 100 × 100 image will produce 10,000 eigenvectors. In practical applications, most faces can typically be identified using a projection on between 100 and 150 eigenfaces, so that most of the 10,000 eigenvectors can be discarded.

=== Matlab example code ===

Here is an example of calculating eigenfaces with the Extended Yale Face Database B. To avoid computational and storage bottlenecks, the face images are downsampled by a factor of 4 × 4 = 16. Note that although the covariance matrix S generates many eigenfaces, only a fraction of those are needed to represent the majority of the faces. For example, to represent 95% of the total variation of all face images, only the first 43 eigenfaces are needed.
To calculate this result, implement the following code:

=== Computing the eigenvectors ===

Performing PCA directly on the covariance matrix of the images is often computationally infeasible. If small images are used, say 100 × 100 pixels, each image is a point in a 10,000-dimensional space and the covariance matrix S is a matrix of 10,000 × 10,000 = $10^8$ elements. However, the rank of the covariance matrix is limited by the number of training examples: if there are N training examples, there will be at most N − 1 eigenvectors with non-zero eigenvalues. If the number of training examples is smaller than the dimensionality of the images, the principal components can be computed more easily as follows.

Let T be the matrix of preprocessed training examples, where each column contains one mean-subtracted image. The covariance matrix can then be computed as $\mathbf{S} = \mathbf{T}\mathbf{T}^T$ and the eigenvector decomposition of S is given by

$$\mathbf{S}\mathbf{v}_i = \mathbf{T}\mathbf{T}^T\mathbf{v}_i = \lambda_i\mathbf{v}_i$$

However, $\mathbf{T}\mathbf{T}^T$ is a large matrix, and if instead we take the eigenvalue decomposition of

$$\mathbf{T}^T\mathbf{T}\mathbf{u}_i = \lambda_i\mathbf{u}_i$$

then we notice that by pre-multiplying both sides of the equation with T, we obtain

$$\mathbf{T}\mathbf{T}^T\mathbf{T}\mathbf{u}_i = \lambda_i\mathbf{T}\mathbf{u}_i$$

Meaning that, if $\mathbf{u}_i$ is an eigenvector of $\mathbf{T}^T\mathbf{T}$, then $\mathbf{v}_i = \mathbf{T}\mathbf{u}_i$ is an eigenvector of S. If we have
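The small-matrix computation described in this section, together with the variance-threshold rule for choosing k, can be sketched in Python with NumPy. This is an illustrative sketch only, not the Matlab code the article refers to: the function name `eigenfaces`, the parameter `variance_threshold`, and the random array standing in for face images are all assumptions made for the example.

```python
import numpy as np


def eigenfaces(images, variance_threshold=0.95):
    """Compute eigenfaces from `images`, shaped (n_pixels, n_images).

    Names and the 0.95 default are illustrative assumptions.
    """
    mean_face = images.mean(axis=1, keepdims=True)
    T = images - mean_face                 # each column is a mean-subtracted image

    # Decompose the small (n_images x n_images) matrix T^T T rather than
    # the large (n_pixels x n_pixels) covariance matrix S = T T^T.
    eigvals, U = np.linalg.eigh(T.T @ T)
    order = np.argsort(eigvals)[::-1]      # sort eigenvalues in descending order
    eigvals, U = eigvals[order], U[:, order]

    # Mean subtraction leaves at most n_images - 1 non-zero eigenvalues.
    keep = eigvals > 1e-10 * eigvals[0]
    eigvals, U = eigvals[keep], U[:, keep]

    V = T @ U                              # v_i = T u_i is an eigenvector of S
    V /= np.linalg.norm(V, axis=0)         # normalise each eigenface to unit length

    # Smallest k whose cumulative share of the variance exceeds the threshold.
    share = np.cumsum(eigvals) / eigvals.sum()
    k = min(int(np.searchsorted(share, variance_threshold, side="right")) + 1,
            len(eigvals))
    return mean_face, V, eigvals, k


rng = np.random.default_rng(0)
faces = rng.random((400, 12))              # 12 synthetic 20 x 20 "images"
mean_face, V, eigvals, k = eigenfaces(faces)
weights = V[:, :k].T @ (faces[:, [0]] - mean_face)   # project one image
```

With 12 images of 400 pixels each, the decomposition runs on a 12 × 12 matrix instead of a 400 × 400 one. The `weights` vector is the compact per-face representation mentioned in the Generation section.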

Vintage computer

A vintage computer is an older computer system that is largely regarded as obsolete. Personal computers have existed since about 1971, and technological advancement since then has meant that existing models are replaced every few years. Nevertheless, these otherwise useless computers have spawned a sub-culture of vintage computer collectors who often spend large sums for the rarest examples, not only to display them but also to restore them to working order. This involves active software development and adaptation to modern uses. It often includes homebrew developers and hackers who add on, update and create hybrid composites from new and old computers for uses for which they were never intended. Ethernet interfaces have been designed for many vintage 8-bit machines to allow limited connectivity to the Internet, where users can access discussion groups, bulletin boards, and software databases. Most of this hobby centers on computers made after 1960, though some collectors also specialize in older computers. The Vintage Computer Festival, an event held by the Vintage Computer Federation for the exhibition and celebration of vintage computers, has been held annually since 1997 and has expanded internationally.

== By platform ==

=== MITS Inc. ===

Micro Instrumentation and Telemetry Systems (MITS) produced the Altair 8800 in 1975. According to Harry Garland, the Altair 8800 was the product that catalyzed the microcomputer revolution of the 1970s.

=== IMSAI ===

The IMSAI 8080 is a clone of the Altair 8800. It was introduced in 1975, first as a kit, and later as an assembled system. The list price was $591 (equivalent to $3,584 in 2025) for a kit, and $931 (equivalent to $5,570 in 2025) assembled.

=== Processor Technology ===

Processor Technology produced the Sol-20. This was one of the first machines to have a case that included a keyboard, a design feature copied by many later "home computers".

=== SWTPC ===

Southwest Technical Products Corporation (SWTPC) produced the 8-bit SWTPC 6800 and later the 16-bit SWTPC 6809 kits that employed the Motorola 68xx series microprocessors.

=== Apple Inc. ===

The earliest Apple Inc. personal computers, using the MOS Technology 6502 processors, are among the most collectible. They are relatively easy to maintain in an operational state thanks to Apple's use of readily available off-the-shelf parts.

* Apple I (1976): The Apple I was Apple's first product and has brought some of the highest prices ever paid for a microcomputer at auction.
* Apple II (1977): The Apple II series of computers are some of the easiest to adapt, thanks to the original expansion architecture designed for them. New peripheral cards are still being designed by an avid, thriving community, thanks to the longevity of this platform, manufactured from 1977 through 1993. Numerous websites exist to support not only legacy users but new adopters who weren't even born when the Apple II was discontinued by Apple.
* Macintosh (1984): The original Macintosh used a 32-bit Motorola 68000 processor running at 7.8336 MHz and came with 128 KB of RAM. The list price was $2495 (equivalent to $7,732 in 2025). Perhaps because of its friendly design and first commercially successful graphical user interface, as well as its enduring Finder application that persists on the most current Macs, the Macintosh is one of the most collected and used vintage computers. With dozens of websites around the world, old Macintosh hardware and software are put to daily use. The Macintosh had a strong presence in many early computer labs, creating a nostalgia factor for former students who recall their first computing experiences.

=== RCA ===

The COSMAC Elf in 1976 was an inexpensive (about $100) single-board computer that was easily built by hobbyists. Many people who could not afford an Altair could afford an ELF, which was based on the RCA 1802 chip.
Because the chips are still available from other sources, modern recreations of the ELF are fairly common and there are several fan websites.

=== IBM ===

The IBM 1130 (1965) was a desk-sized small computer. It was often the first computer used by many college students, and it still has a following of interested users. Most of the remaining 1130 systems in 2023 are in museums, but an emulator is available for users who don't have access to a physical 1130. The 5100 also has an avid collector and fan base. The PC series (5150 PC, 5155 Portable PC, 5160 PC/XT, 5170 PC/AT) has become very popular in recent years, with the earliest models (PC) being considered the most collectible.

=== Acorn BBC & Archimedes ===

The Acorn BBC Micro was a very popular British computer in the 1980s with home and educational users and enjoyed near-universal usage in British schools into the mid-1990s. It was possible to use 100K 5¼-inch disks, and it had many expansion ports. The Archimedes series, the de facto successor to the BBC Micro, has also enjoyed a following in recent years, thanks to its status as the first computer to be based around ARM's RISC microprocessor.

=== Tandy/Radio Shack ===

The Tandy/RadioShack Model 100 is still widely collected and used as one of the earliest examples of a truly portable computer. Other Tandy offerings, such as the TRS-80 line, are also very popular, and early systems, like the Model I, in good condition can command premium prices on the vintage computer market.

=== Sinclair ===

The Sinclair ZX81 and ZX Spectrum series were the most popular British home computers of the early 1980s, with a wide choice of emulators available for both platforms. The Spectrum in particular enjoys a cult following due to its popularity as a games platform, with new game titles still being developed even today. Original "rubber key" Spectrums fetch the highest prices on the second-hand market, with the later Amstrad-built models attracting less of a following.
The earlier ZX81 is not as popular in original hardware form due to its monochrome display and limited abilities next to the Spectrum, but unassembled ZX81 kits still appear on eBay occasionally.

=== MSX ===

Although nearly nonexistent in the United States, the MSX architecture has strong communities of fans and hobbyists worldwide, particularly in Japan (where the standard was conceived and developed), South Korea (the only country that had an MSX-based game console, Zemmix), the Netherlands, Spain, Brazil, Argentina, Russia, Chile, the Middle East, and others. New hardware and software are being actively developed to this day as well. One of the latest fundamental (from hardware and software perspectives) revivals of the MSX is the GR8BIT.

=== Robotron ===

The Robotron Z1013 was an East German home computer produced by VEB Robotron. It had a U880 processor, 16 KB RAM, and a membrane keyboard. The KC 85 series of computers was a modular 8-bit computer system used in East German schools.

=== Commodore ===

* VIC-20
* Commodore 64
* Commodore PET
* Amiga

=== Xerox ===

The Xerox Alto, designed and manufactured by Xerox PARC and released in 1973, was the first personal computer equipped with a graphical user interface. In 1979, Steve Jobs of Apple Inc. arranged for his engineers to visit Xerox in order to see the Alto. The design concepts of the Alto soon appeared in the Apple Lisa and Macintosh systems. The Xerox Star, also known as the 8010/40, was made available in 1981. It followed on the Alto. Like the Alto, this machine was expensive and was only intended for corporate office usage. Therefore, being out of the price range of the average user, this product had little market penetration.

=== Silicon Graphics ===

The SGI Indy, built by Silicon Graphics in 1993, has a history of usage in the development of the Nintendo 64 as well as various CGI projects throughout the 1990s and early 2000s. The Indy and other machines in the SGI lineup have remained cult classics.

T.38

T.38 is an ITU recommendation for allowing transmission of fax over IP networks (FoIP) in real time.

== History ==

The T.38 fax relay standard was devised in 1998 as a way to transport faxes across IP networks between existing Group 3 (G3) fax terminals. T.4 and related fax standards were published by the ITU in 1980, before the rise of the Internet. In the late 1990s, VoIP, or voice over IP, began to gain ground as an alternative to the conventional public switched telephone network (PSTN). However, because most VoIP systems are optimized (through their use of aggressive lossy bandwidth-saving compression) for voice rather than data calls, conventional fax machines worked poorly or not at all on them because of network impairments such as delay, jitter, and packet loss. Thus, some way of transmitting fax over IP was needed.

== Overview ==

In practical scenarios, at least part of a T.38 fax call is carried over the PSTN, although this is not required by the T.38 definition, and two T.38 devices can send faxes to each other. This type of device is called an Internet-Aware Fax device (IAF), and it is capable of initiating or completing a fax call towards the IP network. The typical scenario in which T.38 is used is T.38 fax relay, in which a T.30 fax device sends a fax over the PSTN to a T.38 fax gateway, which converts or encapsulates the T.30 protocol into a T.38 data stream. This is then sent either to a T.38-enabled end point, such as a fax machine or fax server, or to another T.38 gateway that converts it back to a PSTN PCM or analog signal and terminates the fax on a T.30 device. The T.38 recommendation defines the use of both TCP and UDP to transport T.38 packets. Implementations tend to use UDP, because TCP's requirement for acknowledgement packets, and the resulting retransmission during packet loss, introduces delays. When using UDP, T.38 copes with packet loss by using redundant data packets.
T.38 is not a call setup protocol, so T.38 devices need to use standard call setup protocols, such as H.323, SIP, and MGCP, to negotiate the T.38 call.

== Operation ==

There are two primary ways that fax transactions are conveyed across packet networks. The T.37 standard specifies how a fax image is encapsulated in e-mail and transported, ultimately, to the recipient using a store-and-forward process through intermediary entities. T.38, however, defines a protocol that supports the use of the T.30 protocol in both the sender and recipient terminals. (See diagram above.) T.38 lets one transmit a fax across an IP network in real time, just as the original G3 fax standards did for the traditional time-division multiplexed (TDM) network, also called the public switched telephone network or PSTN. A special protocol is needed for real-time fax over IP (Internet Protocol) since existing fax terminals only supported PSTN connections, where the information flow was generally smooth and uninterrupted, as opposed to the jittery arrival of IP packets. The trick was to come up with a protocol that makes the IP network “invisible” to the endpoint fax terminals, which would mean the user of a legacy fax terminal need not know that the fax call was traversing an IP network.

The network interconnections supported by T.38 are shown above. The two fax terminals on either side of the figure communicate using the T.30 fax protocol published by the ITU in 1980. Interconnection of the PSTN with the IP packet network requires a “gateway” between the PSTN and IP networks. PSTN-IP gateways support TDM voice on the PSTN side and VoIP and FoIP on the packet side. For voice sessions, the gateway will take in voice packets on the IP side, accumulate a few packets to ensure a smooth flow of TDM data upon their release, and then meter them out over TDM where they eventually are heard by a human or stored on a computer for later playback.
The gateway employs packet-management techniques to enhance the quality of the speech in the presence of network errors by taking advantage of the natural ability of a listener to not really hear the occasional missing or repeated packet. But facsimile data are transmitted by modems, which aren't as forgiving as the human ear is for speech. Missing packets will often cause a fax session to fail at worst or create one or more image lines in error at best. So the job of T.38 is to “fool” the terminal into “thinking” that it's communicating directly with another T.30 terminal. It will also correct for network delays with so-called spoofing techniques, and missing or delayed packets with fax-aware buffer-management techniques.

Spoofing refers to the logic implemented in the protocol engine of a T.38 relay that modifies the protocol commands and responses on the TDM side to keep network delays on the IP side from causing the transaction to fail. This is done, for example, by padding image lines or deliberately causing a message to be re-transmitted to render network delays transparent to the sending/receiving fax terminals. Networks that do not have packet loss or excessive delay can exhibit acceptable fax performance without T.38, provided the PCM clocks in all gateways are of very high accuracy (explained below). T.38 not only removes the effect of PCM clocks not being synchronized, but also reduces the required network bandwidth by a factor of 10, while it corrects for packet loss and delay.

=== Bandwidth reduction ===

As shown in the diagram below, a T.38 gateway is composed of two primary elements: the fax modems and the T.38 subsystem. The fax modems modulate and demodulate the PCM samples of the analog data, turning the sampled-data representation of the fax terminal's analog signal to its binary translation, and vice versa.
The PSTN samples the analog signal of a voice or modem call (it doesn't know the difference) 8,000 times per second and encodes each sample as an 8-bit byte. This means 8,000 samples per second times 8 bits per sample, or 64,000 bits per second (bit/s), to represent the modem (or voice) data in one direction. For both directions the modem transaction consumes 128,000 bit/s of network bandwidth. However, the typical modem in a fax terminal transmits the image data at 33,600 bit/s, so if the analog data are first converted to the digital content they represent, only 33,600 bit/s (plus network overhead of a few bytes) are needed. And since T.30 fax is a half-duplex protocol, the network is only needed for one direction at a time. Refer to RFC 3261.

=== PCM clock synchronization ===

In the diagram above, there is a sample-rate clock in the fax terminal and one in the gateway's modems that is used to trigger the sampling of the analog line 8,000 times per second. These clocks are usually quite accurate, but in some low-cost terminal adapters (a one- or two-line gateway) the PCM clock can be surprisingly inaccurate. If the terminal is sending data to the gateway, and the gateway's clock is too slow, the buffers (jitter buffers) in the gateway will eventually overflow, causing the transaction to fail. Since the difference is often quite small, this problem tends to occur on long, detailed fax images, which give the clock drift more time to make the jitter buffer in the gateway either underflow or overflow; the effect is the same as missing or duplicated packets.

=== Packet loss ===

T.38 provides facilities to eliminate the effects of packet loss through data redundancy. When a packet is sent, either zero, one, two, three, or even more of the previously sent packets are repeated. (The specification does not impose a limit.)
This increases the network bandwidth required (it's still much less than not using T.38) but it allows the receiving gateway to reconstruct the complete packet sequence, even with a fairly high level of packet loss.

== Related standards ==

* T.4 is the umbrella specification for fax. It specifies the standard image sizes, two forms of image-data compression (encoding), and the image-data format, and references T.30 and the various modem standards.
* T.6 specifies a compression scheme that reduces the time required to transmit an image by roughly 50 percent.
* T.30 specifies the procedures that a sending and receiving terminal use to set up a fax call, determine the image size, encoding, and transfer speed, the demarcation between pages, and the termination of the call. T.30 also references the various modem standards.
* V.21, V.27ter, V.29, V.17, V.34: ITU modem standards used in facsimile. The first three were ratified prior to 1980, and were specified in the original T.4 and T.30 standards. V.34 was published for fax in 1994.
* T.37 is the ITU standard for sending a fax-image file via e-mail to the intended recipient of a fax.
* G.711 pass-through: this is where the T.30 fax call is carried in a VoIP call encoded as audio. This is sensitive to network packet loss, jitter and clock synchronization. When using voice high-compression encoding techniques such as, but not limited to, G.729, some fax tonal signa
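The bandwidth arithmetic and the redundancy scheme described in the Operation section can be illustrated with a short sketch. It is a deliberately simplified model under stated assumptions: packets are plain Python lists of (sequence number, payload) pairs, not the actual T.38 wire format, and the function names `pcm_bandwidth_bps`, `build_packets`, and `reassemble` are hypothetical.

```python
def pcm_bandwidth_bps(sample_rate=8000, bits_per_sample=8, directions=2):
    """Bandwidth of uncompressed PCM audio in bit/s."""
    return sample_rate * bits_per_sample * directions


def build_packets(payloads, redundancy=2):
    """Attach up to `redundancy` previously sent payloads to each packet."""
    packets = []
    for seq in range(len(payloads)):
        oldest = max(0, seq - redundancy)
        # Newest entry first: the primary payload, then the repeated ones.
        entries = [(s, payloads[s]) for s in range(seq, oldest - 1, -1)]
        packets.append(entries)
    return packets


def reassemble(received_packets):
    """Rebuild the payload sequence from whichever packets arrived."""
    recovered = {}
    for entries in received_packets:
        for seq, payload in entries:
            recovered.setdefault(seq, payload)
    return [recovered[s] for s in sorted(recovered)]


payloads = [b"p0", b"p1", b"p2", b"p3", b"p4", b"p5"]
packets = build_packets(payloads, redundancy=2)

# Simulate the loss of two consecutive packets on the IP network.
received = [p for i, p in enumerate(packets) if i not in {2, 3}]
restored = reassemble(received)

print(pcm_bandwidth_bps())      # both directions of a PCM call
print(restored == payloads)     # lost payloads recovered from packet 4
```

With a redundancy of two, any run of up to two consecutive lost packets can be rebuilt from the next packet that arrives, at the cost of roughly tripling the payload carried per packet.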

Deplatforming

Deplatforming, also known as no-platforming, is a boycott on an individual or group by removing the platforms used to share their information or ideas. The term is commonly associated with social media.

== History ==

=== Deplatforming of invited speakers ===

In the United States, the banning of speakers on university campuses dates back to the 1940s. This was carried out by the policies of the universities themselves. The University of California had a policy known as the Speaker Ban, codified in university regulations under President Robert Gordon Sproul, that mostly, but not exclusively, targeted communists. One rule stated that "the University assumed the right to prevent exploitation of its prestige by unqualified persons or by those who would use it as a platform for propaganda." This rule was used in 1951 to block Max Shachtman, a socialist, from speaking at the University of California at Berkeley. In 1947, former U.S. Vice President Henry A. Wallace was banned from speaking at UCLA because of his views on U.S. Cold War policy, and in 1961, Malcolm X was prohibited from speaking at Berkeley as a religious leader.

Controversial speakers invited to appear on college campuses have faced attempts to disinvite them or to otherwise prevent them from speaking. The British National Union of Students established its No Platform policy as early as 1973. In the mid-1980s, visits by South African ambassador Glenn Babb to Canadian college campuses faced opposition from students opposed to apartheid. In the United States, recent examples include the March 2017 disruption by protestors of a public speech at Middlebury College by political scientist Charles Murray. In February 2018, students at the University of Central Oklahoma rescinded a speaking invitation to creationist Ken Ham, after pressure from an LGBT student group.
In March 2018, a "small group of protesters" at Lewis & Clark Law School attempted to stop a speech by visiting lecturer Christina Hoff Sommers. In the 2019 film No Safe Spaces, Adam Carolla and Dennis Prager documented their own disinvitation along with others. As of February 2020, the Foundation for Individual Rights in Education, a speech advocacy group, documented 469 disinvitation or disruption attempts at American campuses since 2000, including both "unsuccessful disinvitation attempts" and "successful disinvitations"; the group defines the latter category as including three subcategories: formal disinvitation by the sponsor of the speaking engagement; the speaker's withdrawal "in the face of disinvitation demands"; and "heckler's vetoes" (situations when "students or faculty persistently disrupt or entirely prevent the speakers' ability to speak"). === Deplatforming in social media === Beginning in 2015, Reddit banned several communities on the site ("subreddits") for violating the site's anti-harassment policy. A 2017 study published in the journal Proceedings of the ACM on Human-Computer Interaction, examining "the causal effects of the ban on both participating users and affected communities," found that "the ban served a number of useful purposes for Reddit" and that "Users participating in the banned subreddits either left the site or (for those who remained) dramatically reduced their hate speech usage. Communities that inherited the displaced activity of these users did not suffer from an increase in hate speech." In June 2020 and January 2021, Reddit also issued bans to pro-Trump communities over violations of the website's content and harassment policies. On May 2, 2019, Facebook and the Facebook-owned platform Instagram announced a ban of "dangerous individuals and organizations" including Nation of Islam leader Louis Farrakhan, Milo Yiannopoulos, Alex Jones and his organization InfoWars, Paul Joseph Watson, Laura Loomer, and Paul Nehlen. 
In the wake of the 2021 storming of the US Capitol, Twitter banned then-president Donald Trump, as well as 70,000 other accounts linked to the event and the far-right movement QAnon. Some studies have found that the deplatforming of extremists reduced their audience, although other research has found that some content creators became more toxic following deplatforming and migration to alt-tech platforms. ==== Twitter ==== On November 18, 2022, Elon Musk, as newly appointed CEO of Twitter, reopened previously banned Twitter accounts of high-profile users, including Kathy Griffin, Jordan Peterson, and The Babylon Bee as part of the new Twitter policy. Musk stated, "New Twitter policy is freedom of speech, but not freedom of reach". ==== Alex Jones ==== On August 6, 2018, Facebook, Apple, YouTube and Spotify removed all content by Jones and InfoWars for policy violations. YouTube removed channels associated with InfoWars, including The Alex Jones Channel. On Facebook, four pages associated with InfoWars and Alex Jones were removed over repeated policy violations. Apple removed all podcasts associated with Jones from iTunes. On August 13, 2018, Vimeo removed all of Jones's videos because of "prohibitions on discriminatory and hateful content". Facebook cited instances of dehumanizing immigrants, Muslims and transgender people, as well as glorification of violence, as examples of hate speech. After InfoWars was banned from Facebook, Jones used another of his websites, NewsWars, to circumvent the ban. Jones's accounts were also removed from Pinterest, Mailchimp and LinkedIn. As of early August 2018, Jones retained active accounts on Instagram, Google+ and Twitter. In September 2018, Jones was permanently banned from Twitter and Periscope after berating CNN reporter Oliver Darcy. On September 7, 2018, the InfoWars app was removed from the Apple App Store for "objectionable content". 
He was banned from using PayPal for business transactions, having violated the company's policies by expressing "hate or discriminatory intolerance against certain communities and religions." After Elon Musk's purchase of Twitter, several previously banned accounts were reinstated, including those of Donald Trump, Andrew Tate and Ye, which raised the question of whether Alex Jones would be unbanned as well. However, Musk denied that Jones would be unbanned, criticizing him as a person who "would use the deaths of children for gain, politics or fame". InfoWars remained available on Roku devices in January 2019, a year after the channel's removal from multiple streaming services. Roku indicated that it does not "curate or censor based on viewpoint," and that it had policies against content that is "unlawful, incited illegal activities, or violates third-party rights," but that InfoWars was not in violation of these policies. Following a social media backlash, Roku removed InfoWars and stated "After the InfoWars channel became available, we heard from concerned parties and have determined that the channel should be removed from our platform." In March 2019, YouTube terminated the Resistance News channel due to its reuploading of live streams from InfoWars. On May 1, 2019, Jones was barred from using both Facebook and Instagram. Jones briefly moved to Dlive, but was suspended in April 2019 for violating community guidelines. In March 2020, the InfoWars app was removed from the Google Play store due to claims of Jones disseminating COVID-19 misinformation. A Google spokesperson stated that "combating misinformation on the Play Store is a top priority for the team" and apps that violate Play policy by "distributing misleading or harmful information" are removed from the store. ==== Donald Trump ==== On January 6, 2021, in a joint session of the United States Congress, the counting of the votes of the Electoral College was interrupted by a breach of the United States Capitol chambers. 
The rioters were supporters of President Donald Trump who hoped to delay and overturn the President's loss in the 2020 election. The event resulted in five deaths and at least 400 people being charged with crimes. The certification of the electoral votes was only completed in the early morning hours of January 7, 2021. In the wake of several tweets by President Trump on January 7, 2021, Facebook, Instagram, YouTube, Reddit, and Twitter all deplatformed Trump to some extent. Twitter deactivated his personal account, which the company said could possibly be used to promote further violence. Trump subsequently tweeted similar messages from the President's official US Government account @POTUS, which resulted in his being permanently banned on January 8. Twitter then announced that Trump's ban from its platform would be permanent. Trump planned to rejoin social media through the use of a new platform by May or June 2021, according to Jason Miller on a Fox News broadcast. The same week Musk announced Twitter's new freedom of speech policy, he tweeted a poll asking whether to bring Trump back onto the platform. The poll ended with 51.8% in favor of unbanning Trump's account. Twitter has since reinstated Trump's Twitter accou

Averbis

Averbis focuses on healthcare, pharma, automotive and intellectual property analytics. Averbis is involved in various research projects of the German Federal Ministry of Economics and Energy and the European Union, such as DebugIT, EUCases, Mantra and SEMCARE. In addition to these projects, Averbis was also involved in the following projects: Greenpilot is a virtual library that provides technical information in the fields of nutrition, environment and agriculture. Medpilot is a virtual library that provides information about medicine and related sciences. In 2013, Averbis was nominated for the German Founder Prize. Averbis GmbH provides text analytics and text mining software to transform unstructured text into actionable information. It was founded in 2007 by IT experts after years of relevant scientific experience in the field of text mining and multilingual information retrieval. Averbis works in the fields of terminology management, natural language processing, machine learning and semantic search. Its text mining software is embedded into the text mining framework UIMA.

European Information Technology Observatory

The European Information Technology Observatory (EITO) gathers information on European and global markets for information technology, telecommunications and consumer electronics. The EITO is managed by Bitkom Research GmbH, a wholly owned subsidiary of BITKOM, the German Association for Information Technology, Telecommunications and New Media. EITO is sponsored by Deutsche Telekom, KPMG and Telecom Italia. The research activities of the EITO Task Force are supported by the European Commission and the OECD. The EITO exists thanks to an initiative of Enore Deotto from Milan and the support of Luis-Alberto Petit Herrera (Madrid), Jörg Schomburg (Hanover) and Günther Möller (Frankfurt). Between 1993 and 2007, the market reports were published as printed annual reports ("EITO yearbook"). Since 2008, the market reports have been available in electronic form and can be purchased on the EITO online portal. Currently, the ICT market reports are divided into the following categories. International Reports: these include ICT market information for all EITO countries, covering either all market segments or only specific segments. The newest ICT Market Report 2013/14, published in October 2013, includes market data for 36 countries: 28 European markets, the BRIC countries, Japan, Turkey and the US, as well as an in-depth analysis of ICT market developments in 9 European countries. The detailed market data and forecasts are available for the period 2010–2014. Country Reports: this category includes EITO reports on a single country's ICT market. The Country ICT Market Reports are published biannually for France, Germany, Italy, Spain and the United Kingdom. Thematic Reports: thematic studies focusing on a specific topic. Customized Reports: market reports produced to order.