Teachnique
      CourseRoadmaps
      Login

      OverviewCommentsUser InputNumbersBooleansHistoryHello World ProgramEnvironment SetupSyntaxVariablesData TypesType CastingUnicode SystemLiteralsOperators

      Control FlowBreak StatementContinue StatementPass StatementNested LoopsDecision MakingIf StatementIf-else StatementNested IF StatementMatch-Case StatementLoopsFor LoopsFor-else LoopsWhile Loops

      FunctionsBuilt-in FunctionsDefault ArgumentsKeyword ArgumentsKeyword-Only ArgumentsPositional ArgumentsPositional-Only ArgumentsArbitrary ArgumentsVariable ScopeFunction AnnotationsModules

      StringSlicing StringsModify StringsString ConcatenationString FormattingEscape CharactersString MethodsString Exercises

      ListsList ExercisesAccess List ItemsChange List ItemsAdd List ItemsRemove List ItemsLoop ListsList ComprehensionSort ListsCopy ListsJoin ListsList Methods

      TuplesAccess Tuple ItemsUpdate TuplesUnpack Tuple ItemsLoop TuplesJoin TuplesTuple MethodsTuple Exercises

      SetsAccess Set ItemsAdd Set ItemsRemove Set ItemsLoop SetsJoin SetsCopy SetsSet OperatorsSet MethodsSet Exercises

      DictionariesDictionary ExercisesAccess Dictionary ItemsChange Dictionary ItemsAdd Dictionary ItemsRemove Dictionary ItemsDictionary View ObjectsLoop DictionariesCopy DictionariesNested DictionariesDictionary Methods

      ArraysAccess Array ItemsAdd Array ItemsRemove Array ItemsLoop ArraysCopy ArraysReverse ArraysSort ArraysJoin ArraysArray MethodsArray Exercises

      File HandlingWrite to FileRead FilesRenaming and Deleting FilesDirectoriesFile Methods

      OOP ConceptsDynamic BindingDynamic TypingAbstractionObject and ClassesEncapsulationInterfacesPackagesInner ClassesAnonymous Class and ObjectsSingleton ClassWrapper ClassesEnumsReflectionClass AttributesClass MethodsStatic MethodsConstructorsAccess ModifiersInheritancePolymorphismMethod OverridingMethod Overloading

      Feedback

      Submit request if you have any questions.

      Course
      Unicode System

      Python Tutorial

      This Python tutorial has been written for the beginners to help them understand the basic to advanced concepts of Python Programming Language. After completing this tutorial, you will find yourself at a great level of expertise in Python, from where you can take yourself to the next levels to become a world class Software Engineer.

      Unicode System

      What is Unicode System?

      Software applications often require to display messages output in a variety in different languages such as in English, French, Japanese, Hebrew, or Hindi. Python's string type uses the Unicode Standard for representing characters. It makes the program possible to work with all these different possible characters.
      A character is the smallest possible component of a text. 'A', 'B', 'C', etc., are all different characters. So are 'È' and 'Í'. A unicode string is a sequence of code points, which are numbers from 0 through 0x10FFFF (1,114,111 decimal). This sequence of code points needs to be represented in memory as a set of code units, and code units are then mapped to 8-bit bytes.

      Character Encoding

      A sequence of code points is represented in memory as a set of code units, mapped to 8-bit bytes. The rules for translating a Unicode string into a sequence of bytes are called a character encoding.
      Three types of encodings are present, UTF-8, UTF-16 and UTF-32. UTF stands for Unicode Transformation Format.

      Python's Unicode Support

      Python 3.0 onwards has built-in support for Unicode. The str type contains Unicode characters, hence any string created using single, double or the triple-quoted string syntax is stored as Unicode. The default encoding for Python source code is UTF-8.
      Hence, string may contain literal representation of a Unicode character (3/4) or its Unicode value (\u00BE).

      Example

      var = "3/4"
      print (var)
      var = "\u00BE"
      print (var)
      This above code will produce the following output
      3/4
      ¾

      Example

      In the following example, a string '10' is stored using the Unicode values of 1 and 0 which are \u0031 and u0030 respectively.
      var = "\u0031\u0030"
      print (var)
      It will produce the following output
      10
      Strings display the text in a human-readable format, and bytes store the characters as binary data. Encoding converts data from a character string to a series of bytes. Decoding translates the bytes back to human-readable characters and symbols. It is important not
      to confuse these two methods. encode is a string method, while decode is a method of the Python byte object.

      Example

      In the following example, we have a string variable that consists of ASCII characters. ASCII is a subset of Unicode character set. The encode() method is used to convert it into a bytes object.
      string = "Hello"
      tobytes = string.encode('utf-8')
      print (tobytes)
      string = tobytes.decode('utf-8')
      print (string)
      The decode() method converts byte object back to the str object. The encodeing method used is utf-8.
      b'Hello'
      Hello

      Example

      In the following example, the Rupee symbol (₹) is stored in the variable using its Unicode value. We convert the string to bytes and back to str.
      string = "\u20B9"
      print (string)
      tobytes = string.encode('utf-8')
      print (tobytes)
      string = tobytes.decode('utf-8')
      print (string)
      When you execute the above code, it will produce the following output
      ₹
      b'\xe2\x82\xb9'
      ₹

      Practice with Online Editor

      Note: This Python online Editor is a Python interpreter written in Rust, RustPython may not fully support all Python standard libraries and third-party libraries yet.
      Remember to save code(Ctrl + S Or Command + S) before run it.