Python bytes type

You have probably learned about Python data types such as strings and numeric types like integers and floating point numbers. In this article you will learn about another data type called bytes. You will study the concepts behind bytes in Python and perform different operations on bytes objects to understand them.

What are bytes in Python?

Generally, when we save data to secondary storage, it is encoded into a sequence of bytes according to some encoding or file format: ASCII, UTF-8, or UTF-16 for strings; PNG or JPEG (JPG) for images; MP3 or WAV for audio. When we read the data back, for example with Python's file read operations, the bytes are decoded into the corresponding text, image, or audio. A bytes object holds this raw, machine-readable data, and we can write a bytes object directly to secondary storage.

In Python, we can explicitly create bytes objects from other data such as lists and strings.
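As a rough illustration of the encode-on-write and decode-on-read cycle described above, the sketch below writes an encoded string to a scratch file and reads it back both as raw bytes and as text. The sample text and the use of tempfile for the file location are assumptions made only for this demonstration.

```python
import os
import tempfile

text = "naïve café"

# Encode the string into bytes before writing it in binary mode.
encoded = text.encode("utf-8")

# Hypothetical scratch file; tempfile picks a safe location for the demo.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(encoded)
    path = handle.name

# Reading in binary mode ("rb") returns the raw bytes unchanged.
with open(path, "rb") as handle:
    raw = handle.read()

# Reading in text mode decodes the bytes back into a str.
with open(path, "r", encoding="utf-8") as handle:
    decoded = handle.read()

os.remove(path)

print(raw)
print(decoded)
```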

How to create bytes in Python?

To create bytes objects we can use the bytes() function. The bytes() function takes three parameters, all of which are optional. The object to be converted to bytes is passed as the first parameter. The second and third parameters are used only when the first parameter is a string. In that case, the second parameter is the encoding of the string and the third parameter names the error-handling scheme (such as 'strict', 'ignore', or 'replace') that is applied when encoding fails. The bytes() function returns an immutable bytes object. In the next sections, we will see how bytes() works by creating bytes objects from different kinds of data.
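A minimal sketch of the third parameter and of immutability, assuming a sample string that contains one non-ASCII character. The variable names are illustrative only.

```python
text = "café"

# With the default handler ('strict'), a character the encoding cannot
# represent raises UnicodeEncodeError.
try:
    bytes(text, "ascii")
except UnicodeEncodeError as exc:
    print("strict failed:", exc.reason)

# 'ignore' drops the offending character; 'replace' substitutes '?'.
ignored = bytes(text, "ascii", "ignore")
replaced = bytes(text, "ascii", "replace")
print(ignored)   # b'caf'
print(replaced)  # b'caf?'

# bytes objects are immutable, so item assignment raises TypeError.
try:
    replaced[0] = 67
except TypeError as exc:
    print("immutable:", exc)
```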

Create a bytes object of given size

To create a bytes object of a given size, we pass the size to the bytes() function. It returns a bytes object of that length with every byte initialized to zero. This can be understood from the following example.

bytes_obj = bytes(10)
print("The bytes object is:", bytes_obj)
print("Size of the bytes object is:", len(bytes_obj) )

Output:

The bytes object is: b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
Size of the bytes object is: 10

Convert String to bytes

To convert a string to a bytes object, we pass the string as the first argument and the encoding as the second argument to the bytes() function. There is also a third argument for error handling, which we leave out here for simplicity. The function returns a bytes object containing the encoded string. This can be understood as follows.

myString = "Pythonforbeginners.com"
print("The given string is:" , myString)
bytes_obj = bytes(myString , "UTF-8")
print("The bytes object is:", bytes_obj)
print("Size of the bytes object is:", len(bytes_obj) )

Output:

The given string is: Pythonforbeginners.com
The bytes object is: b'Pythonforbeginners.com'
Size of the bytes object is: 22
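For comparison, here is a short sketch showing that str.encode() produces the same result as bytes() with an encoding, that decode() reverses it, and that calling bytes() on a string without an encoding fails. The variable names are chosen only for this illustration.

```python
my_string = "Pythonforbeginners.com"

# bytes(string, encoding) and string.encode(encoding) give the same result.
via_bytes = bytes(my_string, "UTF-8")
via_encode = my_string.encode("UTF-8")
print(via_bytes == via_encode)

# decode() turns the bytes object back into the original string.
round_trip = via_bytes.decode("UTF-8")
print(round_trip)

# A string passed without an encoding raises TypeError.
try:
    bytes(my_string)
    missing_encoding_error = None
except TypeError as exc:
    missing_encoding_error = str(exc)
    print("TypeError:", missing_encoding_error)
```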

Convert a list to bytes

We can also convert an iterable of integers, such as a list or tuple, to a bytes object using the bytes() function. To do this, we simply pass the iterable to bytes(), which returns the corresponding bytes object. Remember that a bytes object is immutable and cannot be modified. We can convert a list into bytes as follows.

myList = [1,2,3,4,5]
print("The given list is:" , myList)
bytes_obj = bytes(myList)
print("The bytes object is:", bytes_obj)
print("Size of the bytes object is:", len(bytes_obj) )

Output:

The given list is: [1, 2, 3, 4, 5]
The bytes object is: b'\x01\x02\x03\x04\x05'
Size of the bytes object is: 5

Remember that the list passed to the bytes() function should contain only integers in the range 0 to 255. Passing a list that contains floating point numbers or strings causes bytes() to raise a TypeError, and an integer outside that range raises a ValueError.
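The failure cases can be handled with try and except. The helper below, to_bytes_or_none, is a hypothetical name introduced only for this sketch; it reports the error and returns None instead of letting the exception propagate.

```python
def to_bytes_or_none(values):
    """Return bytes(values), or None if the values cannot be converted."""
    try:
        return bytes(values)
    except (TypeError, ValueError) as exc:
        # TypeError: an element is not an integer (e.g. float or str).
        # ValueError: an integer is outside range(0, 256).
        print(f"cannot convert {values!r}: {type(exc).__name__}: {exc}")
        return None


print(to_bytes_or_none([1, 2, 3]))    # b'\x01\x02\x03'
print(to_bytes_or_none([1, 2.5]))     # None (TypeError)
print(to_bytes_or_none([1, "2"]))     # None (TypeError)
print(to_bytes_or_none([1, 256]))     # None (ValueError)
```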

Conclusion

In this article, we have seen what bytes objects are and how to create them from iterables and strings using the bytes() function. The programs in this article can also be written with exception handling using Python's try and except, which makes them more robust and handles errors in a systematic way. Stay tuned for more informative articles.

In the last lesson, you saw how you could create a bytes object using a string literal with the addition of a 'b' prefix. In this lesson, you’ll learn how to use bytes() to create a bytes object. You’ll explore three different forms of using bytes():

  1. bytes(<s>, <encoding>) creates a bytes object from a string.
  2. bytes(<size>) creates a bytes object consisting of null (0x00) bytes.
  3. bytes(<iterable>) creates a bytes object from an iterable.
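Because the three forms look alike, it is easy to mix them up. The sketch below, using the value 5 purely as an example, passes "the same" value in each form and shows that the results differ.

```python
# An integer is treated as a size: five null bytes.
from_size = bytes(5)

# A list containing the integer is treated as an iterable: one byte, value 5.
from_iterable = bytes([5])

# A string needs an encoding: one byte holding the ASCII code of the digit '5'.
from_string = bytes("5", "ascii")

print(from_size)       # b'\x00\x00\x00\x00\x00'
print(from_iterable)   # b'\x05'
print(from_string)     # b'5'
print(from_string[0])  # 53
```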

Here’s how to use bytes(<s>, <encoding>):

>>> a = bytes('bacon and egg', 'utf8')
>>> a
b'bacon and egg'
>>> type(a)
<class 'bytes'>

>>> b = bytes('Hello ∑ €', 'utf8')
>>> b
b'Hello \xe2\x88\x91 \xe2\x82\xac'

>>> len(a)
13
>>> a
b'bacon and egg'
>>> b
b'Hello \xe2\x88\x91 \xe2\x82\xac'
>>> len(b)
13
>>> a[0]
98
>>> a[1]
97
>>> a[2]
99
>>> b[0]
72
>>> b[1]
101
>>> b[5]
32
>>> b[6]
226
>>> b[7]
136
>>> b[8]
145
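The session above can be summarized in a short sketch: the string has fewer characters than the bytes object has bytes, indexing returns an integer while slicing returns bytes, and a slice only decodes if it covers whole characters. The variable names are illustrative.

```python
text = "Hello ∑ €"
data = bytes(text, "utf8")

# 9 characters, but 13 bytes: '∑' and '€' take three bytes each in UTF-8.
print(len(text), len(data))

# Indexing gives an int; slicing gives a bytes object.
print(data[6])    # 226
print(data[6:7])  # b'\xe2'

# The three bytes at positions 6..8 together decode to '∑'.
sigma = data[6:9].decode("utf8")
print(sigma)

# Cutting a multi-byte character in half cannot be decoded.
try:
    data[6:8].decode("utf8")
    partial_ok = True
except UnicodeDecodeError:
    partial_ok = False
print(partial_ok)
```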

Here’s how to use bytes(<size>):

>>> c = bytes(8)
>>> c
b'\x00\x00\x00\x00\x00\x00\x00\x00'
>>> len(c)
8

Here’s how to use bytes(<iterable>):

>>> d = bytes([115, 112, 97, 109, 33])
>>> d
b'spam!'
>>> type(d)
<class 'bytes'>
>>> d[0]
115
>>> d[3]
109
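The iterable does not have to be a list. As a small sketch, the examples below use a range, a tuple, and a generator expression, and then convert a bytes object back into a list of integers.

```python
# Any iterable of integers in range(0, 256) works.
from_range = bytes(range(65, 70))
from_tuple = bytes((104, 105))
from_generator = bytes(ord(ch) for ch in "spam!")

print(from_range)      # b'ABCDE'
print(from_tuple)      # b'hi'
print(from_generator)  # b'spam!'

# Iterating over a bytes object yields the integers back.
back_to_ints = list(from_generator)
print(back_to_ints)    # [115, 112, 97, 109, 33]
```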

theramstoss on

Question for you: why does bytes('\x80', 'utf8') evaluate to b'\xc2\x80'?

Thank you!

Chris Bailey RP Team on

Hi @theamstoss,

You are heading in a deeper direction when you start to look at encodings. The UTF-8 standard encodes characters in multiple byte sizes. There is an article that covers this, and there will be a video course for it soon. They really do a good deep dive. Here is a code snippet from the article, showing characters just outside the ASCII group (in this case, accented letters) being encoded in UTF-8 as 2 bytes each, while the plain ASCII characters take a single byte each.

>>> "résumé".encode("utf-8")
b'r\xc3\xa9sum\xc3\xa9'
>>> "El Niño".encode("utf-8")
b'El Ni\xc3\xb1o'

>>> b"r\xc3\xa9sum\xc3\xa9".decode("utf-8")
'résumé'
>>> b"El Ni\xc3\xb1o".decode("utf-8")
'El Niño'

The value you have picked, '\x80', is equal to 128, which takes you just outside the ASCII range of 0-127. That is why UTF-8 needs two bytes, b'\xc2\x80', to encode it.
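To make the boundary concrete, here is a small sketch comparing the last ASCII code point (127) with the first one past it (128). It also contrasts encoding the character with building a bytes object directly from the integer, which is a different operation.

```python
last_ascii = "\x7f"   # code point 127
first_beyond = "\x80"  # code point 128

print(ord(last_ascii), ord(first_beyond))  # 127 128

# Code points up to 127 take one byte in UTF-8; 128 already needs two.
one_byte = last_ascii.encode("utf-8")
two_bytes = first_beyond.encode("utf-8")
print(one_byte)   # b'\x7f'
print(two_bytes)  # b'\xc2\x80'

# Building bytes from the integer 128 stores the raw value in a single byte.
raw_byte = bytes([0x80])
print(raw_byte)   # b'\x80'
```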

Bhavesh Sharma on

I'm having difficulty understanding the concept in this video. If the ASCII characters only have values from 0-127, then how does it accept 0-255? Not getting it.

Bartosz Zaczyński RP Team on

@Bhavesh Sharma The original ASCII standard allocated only 7 bits corresponding to values between 0 and 127 to represent Latin alphabet letters, digits, punctuation, and a few other symbols. It was enough for the hardware of the sixties. The later addition of the 8th bit allowed for implementing various extensions known as code pages, which made it possible to use more exotic characters like ąćęłńóśźż.
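A brief sketch of what code pages mean in practice: the same byte value above 127 decodes to different characters depending on which 8-bit encoding is assumed. The two encodings used here (ISO-8859-1 and ISO-8859-2) are picked only as examples.

```python
# One byte with a value above 127, outside the original 7-bit ASCII range.
byte = bytes([0xB1])

# Each code page assigns its own character to that value.
as_latin1 = byte.decode("latin-1")     # Western European code page
as_latin2 = byte.decode("iso-8859-2")  # Central European code page
print(as_latin1)  # ±
print(as_latin2)  # ą

# Plain ASCII has no character for values above 127.
try:
    byte.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
print(ascii_ok)
```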

Biggz78 on

Hi, I am new to learning Python. I bought the book a while ago, and just this week I started to study. The book pointed me to this course, and that's all good. I'm like: String Manipulation, check. Then we move on to built-in String Methods, cool cool, I can follow. And then section 3 ....

Bytes Objects,

And I'm like, UHMMMM, WTF, you lost me. Is this the way to learn? Going from beginner, just learning about strings, and then moving on to Bytes Objects? I don't even know what they are or what they do…

isn’t this too early? or did I miss something here?

Bartosz Zaczyński RP Team on

@Biggz78 It might sound strange jumping from string to bytes, but don’t let the “bytes” name scare you away. The reason why bytes are mentioned right after strings is that both data types in Python share a wealth of common attributes and are closely related. They’re both sequences that behave almost the same. When you look at their attributes, you’ll notice that most are identical:

>>> len(set(dir(str)) | set(dir(bytes)))
83
>>> len(set(dir(str)) & set(dir(bytes)))
72

Strings and bytes have 83 attributes combined, 72 of which are the same. Also, you can go from strings to bytes and the other way around:

>>> "Hello, World!".encode("utf-8")
b'Hello, World!'
>>> b"Hello, World!".decode("utf-8")
'Hello, World!'
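A short sketch of what "behave almost the same" looks like: many methods work identically on both types, while indexing and mixing the two types differ. The sample values are illustrative.

```python
text = "Hello, World!"
data = text.encode("utf-8")

# Many methods exist on both types and work the same way.
print(text.upper(), data.upper())
print(text.split(", "), data.split(b", "))
print(text.startswith("Hello"), data.startswith(b"Hello"))

# Indexing differs: a str yields a one-character str, bytes yields an int.
print(text[0], data[0])  # H 72

# The two types cannot be mixed without an explicit encode or decode.
try:
    text + data
    mixed_ok = True
except TypeError:
    mixed_ok = False
print(mixed_ok)
```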

Biggz78 on

@Bartosz Zaczyński,

I understand where you’re coming from, but try to understand my point of view, Have no clue what bytes are… so all the “attributes” (whatever they are) they share, is all fun and that, for the more seasoned Python user.

I am learning from the book, Python Basics, and just finished chapter 5 today. Assuming you know the book, no mention about bytes, so then to be referred by the book to this resource (end of chapter 4, String and String Methods) again, not having gotten to bytes just doesn’t make sense.

Biggz78 on

@Bartosz Zaczyński,

I think I made a “boo boo”. I might have come across this tutorial by accident, went back to the book to “fact check” and there is no link to this course but to:

Python String Formatting Best Practices

and

Splitting, Concatenating, and Joining Strings in Python

I think I might have made a search error or something and ended up with the wrong tutorial, so my bad, and my apologies.

What is type bytes?

In short, the bytes type is a sequence of bytes that have been encoded and are ready to be stored in memory/disk. There are many types of encodings (utf-8, utf-16, windows-1255), which all handle the bytes differently. The bytes object can be decoded into a str type. The str type is a sequence of unicode characters.
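As a sketch of how encodings "handle the bytes differently", the example below encodes one short Hebrew word with three encodings and compares the results. The sample word is an assumption for the demo, and "cp1255" is the name Python uses for the windows-1255 code page.

```python
# The Hebrew word "shalom", written with escapes to keep the source ASCII-only.
text = "\u05e9\u05dc\u05d5\u05dd"

# Encode the same four characters with three different encodings.
encoded = {name: text.encode(name) for name in ("utf-8", "utf-16-le", "cp1255")}

for name, data in encoded.items():
    # The byte sequences differ, but each decodes back to the same str.
    print(name, len(data), data, data.decode(name) == text)
```

The single-byte code page needs one byte per letter here, while UTF-8 and UTF-16 need two.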

How do you declare a byte data type in Python?

For a single byte, you basically have three choices:

  1. A length 1 bytes (or bytearray) object: mychar = b'\xff' (or mychar = bytearray(b'\xff')).
  2. An int that you don't assign values outside range(256) (or use masking to trim overflow): mychar = 0xff.
  3. A ctypes type, e.g. mychar = ctypes.
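The three choices can be sketched as follows. The original example for the ctypes option is cut off, so ctypes.c_ubyte is used here as an assumption about which type was meant; the variable names are illustrative.

```python
import ctypes

# 1. A length-1 bytes object (immutable) or bytearray (mutable).
as_bytes = b"\xff"
as_bytearray = bytearray(b"\xff")
as_bytearray[0] = 0x0F  # allowed, because bytearray is mutable

# 2. A plain int, masked with 0xff so it stays within range(256).
as_int = 0xFF
wrapped = (as_int + 1) & 0xFF  # 256 wraps around to 0

# 3. A ctypes unsigned byte (assumed type, see the note above).
as_ctypes = ctypes.c_ubyte(0xFF)

print(as_bytes[0], as_bytearray[0], wrapped, as_ctypes.value)
```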

How to read bytes in Python?

You can use bin(ord('b')).replace('b', ''). bin() returns the binary representation as a string with a '0b' prefix, so you have to remove the 'b'. ord() returns the Unicode code point of a character, which for ASCII characters is the same as the ASCII number.
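An alternative sketch that avoids string replacement: format() with the "08b" format specification pads the value to eight binary digits, and the same approach works for every byte in a bytes object.

```python
code = ord("b")
print(code)       # 98
print(bin(code))  # 0b1100010

# Slice off the '0b' prefix, or format directly as 8 zero-padded binary digits.
without_prefix = bin(code)[2:]
padded = format(code, "08b")
print(without_prefix)  # 1100010
print(padded)          # 01100010

# Iterating over a bytes object yields ints, so each byte can be formatted.
bits = [format(byte, "08b") for byte in b"ab"]
print(bits)  # ['01100001', '01100010']
```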

What is Python bytes string?

In Python, a byte string is just that: a sequence of bytes. It isn't human-readable. Under the hood, everything must be converted to a byte string before it can be stored in a computer. On the other hand, a character string, often just called a "string", is a sequence of characters.