Convert words into binary code. Converting text into digital code

Binary code represents text, computer processor instructions, or other data using any two-symbol system. Most commonly, it is a system of 0s and 1s that assigns a pattern of binary digits (bits) to each symbol and instruction. For example, a binary string of eight bits can represent any of 256 possible values and can therefore stand for many different elements. Programmers widely regard binary code as the foundation of their profession and the basic principle on which computer systems and electronic devices operate.

Deciphering the binary code

In computing and telecommunications, binary codes are used for various methods of encoding data characters into bit strings. These methods can use fixed-width or variable-width strings. There are many character sets and encodings for converting to binary code. In fixed-width code, each letter, number, or other character is represented by a bit string of the same length. This bit string, interpreted as a binary number, is usually displayed in code tables in octal, decimal, or hexadecimal notation.

Binary Decoding: A bit string interpreted as a binary number can be converted to a decimal number. For example, the lowercase letter a, if represented by the bit string 01100001 (as in standard ASCII code), can also be represented as the decimal number 97. Converting binary code to text is the same procedure, just in reverse.
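The conversion described above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function names `text_to_bits` and `bits_to_text` are made up for this example, and the sketch assumes plain ASCII input written as 8-bit groups separated by spaces.

```python
def text_to_bits(text):
    """Encode ASCII text as space-separated 8-bit binary strings."""
    return " ".join(format(byte, "08b") for byte in text.encode("ascii"))


def bits_to_text(bits):
    """Decode space-separated binary strings back into ASCII text."""
    return bytes(int(chunk, 2) for chunk in bits.split()).decode("ascii")


print(ord("a"))                           # decimal code of the letter a
print(text_to_bits("a"))                  # its 8-bit pattern
print(bits_to_text("01101000 01101001"))  # two bytes decoded back into letters
```

Running the decoder on the output of the encoder returns the original text, which is the "same procedure in reverse" mentioned above.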

How does this work

What does binary code consist of? The code used in digital computers is based on a binary number system in which there are only two possible states, off and on, usually denoted by zero and one. While in the decimal system, which uses 10 digits, each position is a power of 10 (10, 100, 1000, etc.), in the binary system each digit position is a power of 2 (2, 4, 8, 16, etc.). A binary code signal is a series of electrical pulses that represent numbers, symbols, and operations to be performed.

A device called a clock sends out regular pulses, and components such as transistors are turned on (1) or off (0) to transmit or block the pulses. In binary-coded decimal, each decimal digit (0-9) is represented by a set of four binary digits, or bits. The four basic operations of arithmetic (addition, subtraction, multiplication, and division) can be reduced to combinations of fundamental Boolean algebraic operations on binary numbers.
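To make the reduction of arithmetic to Boolean operations concrete, here is a hedged sketch of addition built only from XOR, AND and OR. The names `full_adder` and `add_bits` and the default 8-bit width are assumptions chosen for this example; real processors implement the same idea in hardware, not in Python.

```python
def full_adder(a, b, carry_in):
    """Add three bits using only XOR, AND and OR."""
    partial = a ^ b
    total = partial ^ carry_in
    carry_out = (a & b) | (partial & carry_in)
    return total, carry_out


def add_bits(x, y, width=8):
    """Add two non-negative integers bit by bit (ripple-carry addition)."""
    result, carry = 0, 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result  # any carry beyond `width` bits is discarded


print(add_bits(5, 3))
```

Subtraction, multiplication and division can be built on top of the same adder, for example by adding a complemented number or by repeated shifting and adding.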

A bit in communication and information theory is a unit of data equivalent to the result of a choice between two possible alternatives in the binary number system commonly used in digital computers.

Binary code reviews

The nature of code and data is a fundamental part of IT. Binary code is the working tool of the specialists behind the scenes of IT: programmers whose work is hidden from the attention of the average user. Developers note that this area requires a deep study of mathematical fundamentals and extensive practice in mathematical analysis and programming.

Binary code is the simplest form of computer code or programming data. It is entirely represented by a binary digit system. Binary code is often associated with machine code, because sets of binary digits can be combined into code that is interpreted by a computer or other hardware. This is partly true: machine code uses sets of binary digits to form instructions.

Along with being the most basic form of code, binary also represents the smallest amount of data that flows through all the complex, end-to-end hardware and software systems that process today's resources and data assets. The smallest amount of data is called a bit. Strings of bits become code or data that is interpreted by the computer.

Binary number

In mathematics and digital electronics, a binary number is a number expressed in the base-2 number system, or binary numeric system, which uses only two characters: 0 (zero) and 1 (one).

The base-2 number system is a positional notation with a radix of 2. Each digit is referred to as a bit. Because it is simple to implement in digital electronic circuits using logic gates, the binary system is used by almost all modern computers and electronic devices.

History

The modern binary number system as the basis for binary code was invented by Gottfried Leibniz in 1679 and presented in his article "Binary Arithmetic Explained". Binary numbers were central to Leibniz's theology. He believed that binary numbers symbolized the Christian idea of creatio ex nihilo, or creation out of nothing. Leibniz tried to find a system that would transform verbal statements of logic into purely mathematical data.

Binary systems that predate Leibniz also existed in the ancient world. An example is the Chinese I Ching, a divination text based on the duality of yin and yang. In Asia and Africa, slit drums with binary tones were used to encode messages. The Indian scholar Pingala (circa 5th century BC) developed a binary system to describe prosody in his work Chandahshastra.

The inhabitants of the island of Mangareva in French Polynesia used a hybrid binary-decimal system until 1450. In the 11th century, the scientist and philosopher Shao Yong developed a method of organizing hexagrams that corresponds to the sequence 0 to 63, as represented in a binary format, with yin being 0 and yang being 1. The order is also a lexicographical order in blocks of elements selected from a two-element set.

Modern era

In 1605, Francis Bacon discussed a system in which the letters of the alphabet could be reduced to sequences of binary digits, which could then be encoded as subtle variations of typeface in any random text. Importantly, Bacon also extended the general theory of binary coding with the observation that this method can be used with any objects.

Another mathematician and philosopher, George Boole, published a paper in 1847 called “The Mathematical Analysis of Logic,” which described the algebraic system of logic known today as Boolean algebra. The system was based on a binary approach and consisted of three basic operations: AND, OR and NOT. This system was not put to practical use until an MIT graduate student named Claude Shannon noticed that the Boolean algebra he was learning was similar to an electrical circuit.

In 1937 Shannon wrote a thesis showing that Boolean algebra could be used to analyze and design switching circuits. Shannon's thesis became the starting point for the use of binary code in practical applications such as computers and electrical circuits.

Other forms of binary code

A bit string is not the only type of binary code. A binary system in general is any system that allows only two options, such as a switch in an electronic system or a simple true or false test.

Braille is a type of binary code widely used by blind people to read and write by touch, named after its creator Louis Braille. This system consists of cells of six dots each, three per column, in which each dot has two states: raised or not raised. Different combinations of dots can represent all letters, numbers, and punctuation marks.

American Standard Code for Information Interchange (ASCII) uses a 7-bit binary code to represent text and other characters in computers, communications equipment, and other devices. Each letter or symbol is assigned a number from 0 to 127.

Binary-coded decimal, or BCD, is a binary-encoded representation of integer values that uses a 4-bit nibble to encode each decimal digit. Four binary bits can encode up to 16 different values.

In BCD-encoded numbers, only the first ten values in each nibble are valid; they encode the decimal digits zero through nine. The remaining six values are invalid and may cause either a machine exception or unspecified behavior, depending on the computer's implementation of BCD arithmetic.

BCD arithmetic is sometimes preferred over floating-point number formats in commercial and financial applications where the complex rounding behavior of floating-point numbers is undesirable.
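As an illustration of the BCD scheme just described, the following sketch encodes each decimal digit as its own nibble and rejects the six invalid nibble values when decoding. The helper names `to_bcd` and `from_bcd` are hypothetical, and raising an error on an invalid nibble is only one of the behaviors an implementation might choose.

```python
def to_bcd(number):
    """Encode a non-negative integer as BCD, one 4-bit nibble per decimal digit."""
    return " ".join(format(int(digit), "04b") for digit in str(number))


def from_bcd(bcd):
    """Decode space-separated nibbles; values above 1001 are rejected as invalid."""
    digits = []
    for nibble in bcd.split():
        value = int(nibble, 2)
        if value > 9:
            raise ValueError(f"invalid BCD nibble: {nibble}")
        digits.append(str(value))
    return int("".join(digits))


print(to_bcd(97))  # BCD: one nibble for 9, one nibble for 7
print(bin(97))     # plain binary of the same number, for comparison
```

Note that the BCD form of a number is generally longer than its plain binary form, which is the price paid for keeping decimal digits separate.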

Application

Most modern computers use binary code for instructions and data. CDs, DVDs, and Blu-ray Discs represent audio and video in binary form. Telephone calls are carried digitally in long-distance and mobile telephone networks using pulse code modulation, and in voice over IP networks.

Everyone knows that computers can perform calculations on large groups of data at enormous speed. But not everyone knows that these actions depend on only two conditions: whether there is current or not and what voltage.

How does a computer manage to process such a variety of information?
The secret lies in the binary number system. All data enters the computer, presented in the form of ones and zeros, each of which corresponds to one state of the electrical wire: ones - high voltage, zeros - low, or ones - the presence of voltage, zeros - its absence. Converting data into zeros and ones is called binary conversion, and its final designation is called binary code.
In decimal notation, based on the decimal number system used in everyday life, a numerical value is represented by ten digits from 0 to 9, and each place in the number has a value ten times higher than the place to the right of it. To represent the number ten, which is one greater than nine, a zero is placed in that position and a one is placed in the next, more significant place to the left. Similarly, in the binary system, which uses only two digits, 0 and 1, each place is worth twice as much as the place to the right of it. Thus, in binary code only zero and one can be written with a single digit, and any number greater than one requires at least two places. After zero and one, the next three binary numbers are 10 (read one-zero), 11 (read one-one) and 100 (read one-zero-zero). 100 binary is equivalent to 4 decimal. The top table on the right shows other binary equivalents.
Any number can be expressed in binary, it just takes up more space than in decimal. The alphabet can also be written in the binary system if a certain binary number is assigned to each letter.
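One common way to obtain the binary form of a number is repeated division by two, collecting the remainders. The sketch below assumes non-negative whole numbers, and the function name `to_binary` is chosen only for this example.

```python
def to_binary(n):
    """Convert a non-negative integer to a binary string by repeated division by 2."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))  # each remainder is the next digit, right to left
        n //= 2
    return "".join(reversed(digits))


for value in range(5):
    print(value, to_binary(value))
```

The loop prints the first five numbers in both notations, matching the sequence 0, 1, 10, 11, 100 given in the text.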

Two figures for four places
16 combinations can be made using dark and light balls, combining them in sets of four. If dark balls are taken as zeros and light balls as ones, then the 16 sets form a four-bit binary code with numerical values from zero to fifteen (see top table on page 27). Even with two types of balls in the binary system, an unlimited number of combinations can be built simply by increasing the number of balls in each group, that is, the number of places in the numbers.
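The sets of four balls can be enumerated mechanically. In this small sketch the character "0" stands for a dark ball and "1" for a light one; the variable name `patterns` is an assumption of the example.

```python
from itertools import product

# Every arrangement of four places, each holding 0 (dark) or 1 (light).
patterns = ["".join(bits) for bits in product("01", repeat=4)]

for pattern in patterns:
    print(pattern, int(pattern, 2))  # the pattern and its numerical value

print(len(patterns))  # number of combinations, 2 ** 4
```

Changing `repeat=4` to a larger number shows how quickly the number of combinations grows as places are added.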

Bits and bytes

The smallest unit in computer processing, a bit is a unit of data that can have one of two possible conditions. For example, each of the ones and zeros (on the right) represents 1 bit. A bit can be represented in other ways: the presence or absence of electric current, a hole or its absence, the direction of magnetization to the right or left. Eight bits make up a byte. 256 possible bytes can represent 256 characters and symbols. Many computers process one byte of data at a time.

Binary conversion. Four-digit binary code can represent decimal numbers from 0 to 15.

Code tables

When binary code is used to represent letters of the alphabet or punctuation marks, code tables are required that indicate which code corresponds to which character. Several such codes have been compiled. Most PCs use a seven-bit code called ASCII, or American Standard Code for Information Interchange. The table on the right shows the ASCII codes for the English alphabet. Other codes cover thousands of characters and the alphabets of other languages of the world.

Part of an ASCII code table

08.06.2018

Blog of Dmitry Vassiyarov.

Binary code - where and how is it used?

Today I am especially glad to meet you, my dear readers, because I feel like a teacher who, at the very first lesson, begins to introduce the class to letters and numbers. And since we live in a world of digital technology, I will tell you what binary code is, which is their basis.

Let's start with the terminology and find out what binary means. For clarification, let’s return to our usual number system, which is called “decimal”. That is, we use 10 digits, which make it possible to conveniently operate with various numbers and keep appropriate records. Following this logic, the binary system uses only two characters. In our case, these are just “0” (zero) and “1” (one). And here I want to warn you that hypothetically there could be other symbols in their place, but it is precisely these values, indicating the absence (0, empty) and the presence of a signal (1 or “stick”), that will help us further understand the structure of the binary code.

Why is binary code needed?

Before the advent of computers, various automatic systems were used, the operating principle of which was based on receiving a signal. The sensor is triggered, the circuit is closed and a certain device is turned on. No current in the signal circuit - no operation. It was electronic devices that made it possible to achieve progress in processing information represented by the presence or absence of voltage in a circuit.

Their further complication led to the emergence of the first processors, which also did their job, processing a signal consisting of pulses alternating in a certain way. We will not delve into the program details now, but the following is important for us: electronic devices turned out to be able to distinguish a given sequence of incoming signals. Of course, it is possible to describe the conditional combination this way: “there is a signal”; "no signal"; “there is a signal”; "there is a signal." You can even simplify the notation: “there is”; "No"; "There is"; "There is".

But it is much easier to denote the presence of a signal with a unit “1”, and its absence with a zero “0”. Then we can use a simple and concise binary code instead: 1011.

Of course, processor technology has stepped far forward, and now chips are able to perceive not just a sequence of signals, but entire programs written with specific commands consisting of individual characters. But to record them, the same binary code is used, consisting of zeros and ones, corresponding to the presence or absence of a signal. Whether the signal is present or not does not matter: for a chip, either option is a single piece of information, which is called a “bit” (the bit is the official unit of measurement).

Conventionally, a symbol can be encoded as a sequence of several bits. Two signals (or their absence) can describe only four options: 00; 01; 10; 11. This encoding method is called two-bit. But it can also be:

  • four-bit (as in the example in the paragraph above 1011) allows you to write 2^4 = 16 character combinations;
  • eight-bit (for example: 0101 0011; 0111 0001). At one time it was of greatest interest to programming because it covered 2^8 = 256 values. This made it possible to describe all decimal digits, the Latin alphabet and special characters;
  • sixteen-bit (1100 1001 0110 1010) and higher. But records with such a length are already for modern, more complex tasks. Modern processors use 32 and 64-bit architecture;

Frankly, there is no single official version, but it so happened that it was the combination of eight characters that became the standard measure of stored information called a “byte.” This could be applied even to one letter written in 8-bit binary code. So, my dear friends, please remember (if anyone didn’t know):

8 bits = 1 byte.

That's how it is. Historically, though, machines have used bytes of other sizes, so the term has not always meant exactly eight bits. By the way, thanks to binary code we can estimate the size of files in bytes and the speed of data transmission over the Internet in bits per second.

Binary encoding in action

To standardize the recording of information for computers, several coding systems have been developed. One of them, ASCII, has become widespread; it is a 7-bit code that is usually stored in 8-bit bytes, with the upper half of the table left for extensions. The values in it are distributed in a special way:

  • the first 32 codes, numbered 0 to 31, are control characters (from 00000000 to 00011111). They serve for service commands, output to a printer or screen, sound signals, text formatting;
  • the following from 32 to 127 (00100000 – 01111111) are the Latin alphabet, auxiliary symbols and punctuation marks;
  • the rest, up to the 255th (10000000 – 11111111), form the alternative part of the table, used for special tasks and displaying national alphabets;

The decoding of the values in it is shown in the table.

If you think that “0” and “1” are located in a chaotic order, then you are deeply mistaken. Using any number as an example, I will show you a pattern and teach you how to read numbers written in binary code. But for this we will accept some conventions:

  • we will read a byte of 8 characters from right to left;
  • if in ordinary numbers we use the places of ones, tens, hundreds, then here the bits stand for powers of two, which from left to right are: 128-64-32-16-8-4-2-1;
  • Now we look at the binary code of the number, for example 00011011. Where there is a “1” signal in the corresponding position, we take the value of this bit and sum the values up in the usual way. Accordingly: 0+0+0+16+8+0+2+1 = 27. You can verify the correctness of this method by looking at the code table.
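The reading rule from the list above can be checked with a short script. This is a sketch only: the function name `decode_byte` is invented for the example, and it assumes the input is a string of "0" and "1" characters written with the most significant bit on the left.

```python
def decode_byte(bits):
    """Sum the place values (128, 64, ..., 2, 1) of every position holding a 1."""
    weights = [2 ** power for power in range(len(bits) - 1, -1, -1)]
    terms = [weight if bit == "1" else 0 for weight, bit in zip(weights, bits)]
    return terms, sum(terms)


terms, total = decode_byte("00011011")
print(" + ".join(str(term) for term in terms), "=", total)
```

The script prints the individual place values and their sum, so the manual calculation can be compared against it term by term.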

Now, my inquisitive friends, you not only know what binary code is, but also know how to convert the information encoded by it.

Language understandable to modern technology

Of course, the algorithm for reading binary code by processor devices is much more complicated. But you can use it to write down anything you want:

  • text information with formatting options;
  • numbers and any operations with them;
  • graphic and video images;
  • sounds, including those beyond our hearing range;

In addition, due to the simplicity of the “presentation”, various ways of recording binary information are possible: HDD disks;

The advantages of binary coding are complemented by almost unlimited possibilities for transmitting information over any distance. This is the method of communication used with spacecraft and artificial satellites.

So, today the binary number system is a language that is understood by most of the electronic devices we use. And what’s most interesting is that no other alternative is foreseen for now.

I think that the information I have presented will be quite enough for you to get started. And then, if such a need arises, everyone will be able to delve deeper into an independent study of this topic. I will say goodbye and after a short break I will prepare for you a new article on my blog on some interesting topic.

It's better if you tell me it yourself ;)

See you soon.

The set of characters with which text is written is called an alphabet.

The number of characters in the alphabet is its power (size).

Formula for determining the amount of information: N = 2^b,

where N is the power of the alphabet (number of characters),

b – number of bits (information weight of the symbol).

The alphabet, with a capacity of 256 characters, can accommodate almost all the necessary characters. This alphabet is called sufficient.

Because 256 = 2^8, the weight of 1 character is 8 bits.
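The relationship N = 2^b can also be turned around to find how many bits a symbol needs for a given alphabet size. The sketch below is an illustration under the assumption that every symbol gets the same whole number of bits; the function name `bits_per_symbol` is hypothetical.

```python
def bits_per_symbol(alphabet_size):
    """Smallest whole number of bits b such that 2 ** b >= alphabet_size."""
    if alphabet_size < 1:
        raise ValueError("alphabet must contain at least one symbol")
    # (N - 1).bit_length() is the exact integer form of ceil(log2(N)).
    return max(1, (alphabet_size - 1).bit_length())


print(bits_per_symbol(256))  # the sufficient alphabet from the text
print(bits_per_symbol(33))   # an alphabet of 33 letters
```

When the alphabet size is not an exact power of two, the result is rounded up, so some bit patterns remain unused.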

The unit of measurement 8 bits was given the name 1 byte:

1 byte = 8 bits.

With such a one-byte encoding, the binary code of each character in computer text takes up 1 byte of memory.

How is text information represented in computer memory?

The convenience of byte-by-byte character encoding is obvious because a byte is the smallest addressable part of memory and, therefore, the processor can access each character separately when processing text. On the other hand, 256 characters is quite a sufficient number to represent a wide variety of symbolic information.

Now the question arises, which eight-bit binary code to assign to each character.

It is clear that this is a conditional matter; you can come up with many encoding methods.

All characters of the computer alphabet are numbered from 0 to 255. Each number corresponds to an eight-bit binary code from 00000000 to 11111111. This code is simply the serial number of the character in the binary number system.

A table in which all characters of the computer alphabet are assigned serial numbers is called an encoding table.

Different types of computers use different encoding tables.

The ASCII table (pronounced “ask-ee”), the American Standard Code for Information Interchange, has become the international standard for PCs.

The ASCII code table is divided into two parts.

Only the first half of the table is the international standard, i.e. symbols with numbers from 0 (00000000), up to 127 (01111111).

ASCII encoding table structure

| Serial number | Code | Symbol |
|---|---|---|
| 0 - 31 | 00000000 - 00011111 | Symbols with numbers from 0 to 31 are usually called control symbols. Their function is to control the process of displaying text on the screen or printing, sounding a sound signal, marking up text, etc. |
| 32 - 127 | 00100000 - 01111111 | Standard part of the table (English). This includes lowercase and uppercase letters of the Latin alphabet, decimal digits, punctuation marks, all kinds of brackets, commercial and other symbols. Character 32 is a space, i.e. an empty position in the text. All others are represented by certain signs. |
| 128 - 255 | 10000000 - 11111111 | Alternative part of the table (Russian). The second half of the ASCII code table, called the code page (128 codes, starting from 10000000 and ending with 11111111), can have different variants, and each variant has its own number. The code page is primarily used to accommodate national alphabets other than Latin. In Russian national encodings, characters from the Russian alphabet are placed in this part of the table. |
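The three-part structure of the table can be expressed as a small lookup. This sketch follows the division given above (0 to 31, 32 to 127, 128 to 255); the function name `classify` and the labels it returns are assumptions made for the example.

```python
def classify(code):
    """Name the part of an 8-bit code table that a code number falls into."""
    if not 0 <= code <= 255:
        raise ValueError("code must fit in one byte")
    if code <= 31:
        return "control"
    if code <= 127:
        return "standard"
    return "code page"


for char in ("\n", " ", "A", "z"):
    code = ord(char)
    print(repr(char), code, format(code, "08b"), classify(code))
```

Each printed row shows a character, its serial number, its eight-bit code and the part of the table it belongs to.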

First half of the ASCII code table


Please note that in the encoding table, letters (uppercase and lowercase) are arranged in alphabetical order, and numbers are ordered in ascending order. This observance of lexicographical order in the arrangement of characters is called the principle of sequential coding of the alphabet.

For letters of the Russian alphabet, the principle of sequential coding is also observed.

Second half of the ASCII code table


Unfortunately, there are currently five different Cyrillic encodings (KOI8-R, Windows, MS-DOS, Macintosh and ISO). Because of this, problems often arise with transferring Russian text from one computer to another, from one software system to another.

Chronologically, one of the first standards for encoding Russian letters on computers was KOI8 ("Information Exchange Code, 8-bit"). This encoding was used back in the 70s on computers of the ES computer series, and from the mid-80s it began to be used in the first Russified versions of the UNIX operating system.

The CP866 encoding ("CP" stands for "Code Page") dates from the early 90s, the time of dominance of the MS-DOS operating system.

Apple computers running the Mac OS operating system use their own Mac encoding.

In addition, the International Standards Organization (ISO) has approved another encoding called ISO 8859-5 as a standard for the Russian language.

The most common encoding currently used is the Microsoft Windows encoding, abbreviated CP1251.

Since the late 90s, the problem of standardizing character encoding has been addressed by the introduction of a new international standard called Unicode. It was originally designed as a 16-bit encoding, i.e. it allocated 2 bytes of memory for each character. Of course, this doubles the amount of memory occupied compared with a one-byte encoding, but such a code table allows the inclusion of up to 65536 characters. The standard has since grown beyond that limit, and variable-width forms such as UTF-8 are now widely used to store it. The complete specification of the Unicode standard includes existing, extinct and artificially created alphabets of the world, as well as many mathematical, musical, chemical and other symbols.

Let's try using an ASCII table to imagine what words will look like in the computer's memory.

Internal representation of words in computer memory

Sometimes it happens that a text consisting of letters of the Russian alphabet received from another computer cannot be read - some kind of “abracadabra” is visible on the monitor screen. This happens because computers use different character encodings for the Russian language.
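The effect described above is easy to reproduce. The sketch below uses the codec names from Python's standard library for the five Cyrillic encodings mentioned earlier; the sample word and the choice of CP1251 and KOI8-R for the mismatch are arbitrary assumptions of the example.

```python
text = "Привет"

# The same word produces different bytes under each Cyrillic code page.
for codec in ("koi8_r", "cp866", "cp1251", "iso8859_5", "mac_cyrillic"):
    print(codec, text.encode(codec).hex(" "))

# Bytes written as CP1251 but read as KOI8-R come out as the wrong letters.
garbled = text.encode("cp1251").decode("koi8_r")
print(garbled)

# Decoding with the encoding that was actually used restores the text.
restored = text.encode("cp1251").decode("cp1251")
print(restored)
```

The bytes themselves are never damaged in this scenario; only the table used to interpret them is wrong, which is why choosing the right encoding makes the text readable again.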

The binary system is used in digital devices because it is the simplest and meets the following requirements:

  • The fewer values ​​there are in the system, the easier it is to manufacture individual elements that operate on these values. In particular, two digits of the binary number system can be easily represented by many physical phenomena: there is a current - there is no current, the magnetic field induction is greater than a threshold value or not, etc.
  • The fewer states an element has, the higher the noise immunity and the faster it can operate. For example, to encode three states through the magnitude of the magnetic field induction, you will need to enter two threshold values, which will not contribute to noise immunity and reliability of information storage.
  • Binary arithmetic is quite simple. The tables of addition and multiplication, the basic operations with numbers, are simple.
  • It is possible to use the apparatus of logical algebra to perform bitwise operations on numbers.

Wikimedia Foundation. 2010.
