Protocol Buffers

Understanding Protocol Buffers (Proto3) for Android Development with Kotlin

What are Protocol Buffers and Why Use Them?

Protocol Buffers (protobuf) are Google’s language-neutral, platform-neutral mechanism for serializing structured data – think of it like JSON or XML, but smaller, faster, and simpler (Overview | Protocol Buffers Documentation). With protobuf, you define your data’s structure once in a .proto file, and the protobuf compiler generates code in multiple languages (Java, Kotlin, C++, Python, etc.) so you can easily read and write that structured data across different systems (Understanding Protocol Buffers for Android Development | by BHAVNA THACKER | Medium). This makes it ideal for scenarios where you need efficient, cross-platform data exchange – for example, sending data between an Android app and a server, or saving complex data to disk on a mobile device.

Protobuf uses a binary format that is much more compact and faster to parse than text-based formats like JSON (Overview | Protocol Buffers Documentation) (Working with Proto DataStore | Android Developers). Because the data is encoded in binary using predefined numeric tags, messages are typically much smaller in size and quicker to serialize/deserialize. In fact, protocol buffers are widely used at Google for high-performance inter-service communication and storing data, due to their efficiency and the ability to evolve data formats over time without breaking compatibility (Overview | Protocol Buffers Documentation).

A key advantage of protobuf is that it enforces a schema. You explicitly define the structure (fields and data types) in the .proto file, which leads to strongly-typed data in your code. This can prevent errors (for example, type mismatches or missing fields) that might occur with unstructured formats. Android’s Jetpack DataStore is a real-world example that leverages protobuf: its Proto DataStore variant uses protocol buffers to store typed data (like user preferences) with type-safety and efficiency, which is a big improvement over plain text or SharedPreferences (Working with Proto DataStore | Android Developers).

When to use protobuf: Use Protocol Buffers when you need a compact and efficient format for structured data, especially if your app interacts with services or modules written in other languages. It’s great for network payloads, configuration files, or persistent storage where performance matters. Because protobuf messages are strictly structured, they can be evolved (you can add new fields later) without breaking older code that doesn’t know about those new fields – unknown fields are simply ignored, which aids in forward compatibility. On the other hand, if human-readability or ad-hoc flexibility is more important than performance (e.g. for simple configs or debugging data), formats like JSON or XML might be easier to work with. In summary, protobuf shines for speed, size, and cross-language consistency, making it well-suited for Android apps that require efficient data storage or communication.

Proto3 Syntax: Building a Schema with `.proto` Files

A Protocol Buffers schema is defined in a .proto file using the proto3 syntax. In this section, we’ll break down the various components of a proto3 file, including the syntax declaration, message definitions and fields, enumerations, packages and imports, service definitions (for RPC), and options. By understanding these, you’ll be able to read and write your own proto3 files for your Android/Kotlin projects.

Syntax Version Declaration

Every proto file should start by declaring which syntax (version) it uses. For proto3, the first non-comment line of the file is usually:

syntax = "proto3";

This line tells the protobuf compiler that the file uses the proto3 language version (Language Guide (proto 3) | Protocol Buffers Documentation). If you omit this, the compiler would assume you are using proto2 (an older version), which has some differences. Always include syntax = "proto3"; at the top of your file to avoid confusion.

Defining Messages and Fields

A message in protobuf is a container for fields, somewhat analogous to a class or a data structure. You define a message with the message keyword followed by the message name and a block of field definitions. For example:

message Person {
  string name = 1;
  int32  id   = 2;
  string email = 3;
}

This defines a message Person with three fields: name, id, and email (Protocol Buffer Basics: Kotlin | Protocol Buffers Documentation). Each field in a message has three key parts: a type, a name, and a tag number. In the example, the types are string and int32 (32-bit integer), the names are name, id, email, and the tag numbers are 1, 2, 3 respectively.

Field types: Protobuf supports many scalar value types such as integers (int32, int64, uint32, etc.), floating-point numbers (float, double), booleans (bool), strings, and bytes (Protocol Buffer Basics: Kotlin | Protocol Buffers Documentation). You can also use composite types:

Other message types as field types (allowing you to nest structured messages).
Enumerations (enums) for fields that should be one of a predefined set of values.
Special types like oneof (discussed later) for mutually exclusive fields, and map<key_type, value_type> for key-value pairs.

Each field is assigned a unique tag number (the official documentation calls it the field number) which identifies the field in the binary data. Tag numbers must be in the range 1 to 536,870,911 (2^29 - 1), excluding 19000 through 19999, which are reserved for the protobuf implementation. Within a single message no two fields can have the same tag. These tag numbers are used instead of field names in the binary encoding, which is one reason protobuf data is so compact. In the example, name has tag 1, id tag 2, etc. The order of fields in the message definition doesn’t matter for serialization, but choosing lower tag numbers for frequently-used fields can save a tiny bit of space (tags 1–15 encode to 1 byte, whereas tags 16 and above encode to 2 or more bytes) (Understanding Protocol Buffers for Android Development | by BHAVNA THACKER | Medium) (Protocol Buffer Basics: Kotlin | Protocol Buffers Documentation). However, this is an optimization detail; the main rule is to never change a field’s tag once it’s in use (changing tags would break compatibility with data encoded with the old tags).
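To make the "1 byte versus 2 or more bytes" rule concrete, here is a minimal Kotlin sketch that computes how many bytes a field's key takes on the wire. It assumes the standard protobuf encoding, where the key is the varint of (field number shifted left by 3, combined with a 3-bit wire type). The function names varintSize and tagSize are made up for this illustration and are not part of any protobuf library.

```kotlin
/** Bytes needed to store a non-negative value as a base-128 varint. */
fun varintSize(value: Long): Int {
    var remaining = value
    var size = 1
    while (remaining >= 0x80L) {   // more than 7 bits left: another byte is needed
        remaining = remaining ushr 7
        size++
    }
    return size
}

/** Size of a field's key on the wire: varint of (fieldNumber shl 3) or wireType. */
fun tagSize(fieldNumber: Int, wireType: Int = 0): Int {
    require(fieldNumber in 1..536_870_911) { "field number out of range" }
    require(wireType in 0..5) { "unknown wire type" }
    return varintSize((fieldNumber.toLong() shl 3) or wireType.toLong())
}

fun main() {
    for (n in listOf(1, 15, 16, 2047, 2048)) {
        println("field $n -> ${tagSize(n)} byte(s)")
    }
    // field 1 -> 1 byte(s)
    // field 15 -> 1 byte(s)
    // field 16 -> 2 byte(s)
    // field 2047 -> 2 byte(s)
    // field 2048 -> 3 byte(s)
}
```

The jump at 16 comes from the three wire-type bits: only four bits of the first byte are left for the field number.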

Default values: In proto3, fields are optional by nature. Unlike proto2, there is no required keyword in proto3 (every field can be omitted). If a field isn’t set in a message, a default value is used when reading the message: for numeric types the default is 0, for strings it’s the empty string, for bytes it’s empty bytes, for booleans it’s false, and for enums it’s the first defined value, which must be 0 (Understanding Protocol Buffers for Android Development | by BHAVNA THACKER | Medium) (Protocol Buffer Basics: Kotlin | Protocol Buffers Documentation). For message-type fields, the default is essentially an “empty” message of that type. This means you don’t have to specify every field: unset fields simply take a default. The flip side is that, for a plain scalar field, a reader cannot tell “never set” apart from “explicitly set to the default”, because default values are not written to the wire. Proto3 did away with required fields to make evolving schemas safer (no more breaking old code by missing a required field) and to simplify usage. (Proto3 also has an optional keyword for exactly this case: marking a scalar field optional gives it explicit presence, so the generated code gains a presence check such as hasEmail() in Java, and the value is serialized even when it equals the default. Most beginners don’t need optional unless they have a specific reason to know whether a field was set or just defaulted.)
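The following Kotlin sketch illustrates why unset and default-valued fields look the same to a reader. It is a hand-rolled encoder for the Person message shown earlier, written only for illustration: PersonSketch, putVarint, putString, and encode are hypothetical names, not what protoc generates, and the sketch handles non-negative ids only.

```kotlin
import java.io.ByteArrayOutputStream

// Hand-written stand-in for the Person message; NOT the class protoc generates.
data class PersonSketch(val name: String = "", val id: Int = 0, val email: String = "")

// Writes a non-negative Int as a base-128 varint.
fun ByteArrayOutputStream.putVarint(value: Int) {
    var v = value
    while (v >= 0x80) {
        write((v and 0x7F) or 0x80)   // low 7 bits, continuation bit set
        v = v ushr 7
    }
    write(v)
}

// Writes a string field, skipping it entirely when it holds the default ("").
fun ByteArrayOutputStream.putString(fieldNumber: Int, value: String) {
    if (value.isEmpty()) return
    val bytes = value.toByteArray(Charsets.UTF_8)
    putVarint((fieldNumber shl 3) or 2)   // wire type 2 = length-delimited
    putVarint(bytes.size)
    write(bytes)
}

fun encode(p: PersonSketch): ByteArray {
    require(p.id >= 0) { "this sketch handles non-negative ids only" }
    val out = ByteArrayOutputStream()
    out.putString(1, p.name)
    if (p.id != 0) {                      // default 0 is not written
        out.putVarint((2 shl 3) or 0)     // wire type 0 = varint
        out.putVarint(p.id)
    }
    out.putString(3, p.email)
    return out.toByteArray()
}

fun main() {
    println(encode(PersonSketch()).size)                          // 0
    println(encode(PersonSketch(name = "Al", id = 7)).toList())   // [10, 2, 65, 108, 16, 7]
}
```

A message whose fields all hold defaults serializes to zero bytes, so a reader has nothing to distinguish "id was set to 0" from "id was never set".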

Repeated fields: If you want a field to be a list of values, you can mark it as repeated. For example, repeated string phone_numbers = 4; means the phone_numbers field can appear any number of times (including zero) in the Person message. In the generated code, this typically becomes a list/array type. The order of repeated elements is preserved. Think of a repeated field as a dynamic array: you can add as many entries as needed (Understanding Protocol Buffers for Android Development | by BHAVNA THACKER | Medium) (Protocol Buffer Basics: Kotlin | Protocol Buffers Documentation). For instance, you might have a Person with multiple phone numbers, so you’d use a repeated field for phone numbers. For string and message elements, each element is encoded with its own tag, so the tag is repeated for every element in the binary format (which is why using a low tag number for a frequently repeated field can save space) (Understanding Protocol Buffers for Android Development | by BHAVNA THACKER | Medium). Repeated numeric scalars (such as int32 or bool) are packed by default in proto3: the tag and a total length are written once, followed by the values back to back.
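As a rough illustration of how a repeated string field such as phone_numbers lands on the wire, the Kotlin sketch below writes one tag, one length, and the UTF-8 payload per element. It is a hand-written approximation of the encoding, not generated code; putVarint and encodeRepeatedString are names invented for this example.

```kotlin
import java.io.ByteArrayOutputStream

// Writes a non-negative Int as a base-128 varint.
fun ByteArrayOutputStream.putVarint(value: Int) {
    var v = value
    while (v >= 0x80) {
        write((v and 0x7F) or 0x80)
        v = v ushr 7
    }
    write(v)
}

/** Encodes a `repeated string` field: tag + length + payload for every element, in list order. */
fun encodeRepeatedString(fieldNumber: Int, values: List<String>): ByteArray {
    val out = ByteArrayOutputStream()
    for (s in values) {
        val bytes = s.toByteArray(Charsets.UTF_8)
        out.putVarint((fieldNumber shl 3) or 2)   // wire type 2 = length-delimited
        out.putVarint(bytes.size)
        out.write(bytes)
    }
    return out.toByteArray()
}

fun main() {
    val wire = encodeRepeatedString(4, listOf("555-1234", "555-9876"))
    println(wire.size)   // 20: per element, 1 tag byte + 1 length byte + 8 payload bytes
}
```

With field number 4 the tag fits in one byte; with a field number of 16 or more, every element would pay for a two-byte tag.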

Oneof (optional reading): Proto3 also provides a special construct called oneof. A oneof allows you to define a set of fields in a message, of which at most one can be set at the same time. It’s like a union type: setting one field in the oneof automatically clears the others. This is useful for mutually exclusive fields (for example, if a response message can contain data of one of several types, but never more than one at once). Oneof fields share the same tag namespace as regular fields (each member of the oneof needs a tag that is unique within the whole message). We won’t go in-depth on oneof here, but it’s good to know it exists for advanced use cases (Overview | Protocol Buffers Documentation).
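If you think in Kotlin terms, a oneof behaves much like a sealed type held in a single slot. The sketch below is only an analogy written by hand: Outcome, ResponseBuilderSketch, and describe are hypothetical names, and the code protoc actually generates looks different (it exposes per-field accessors plus a "which case is set" enum).

```kotlin
// The possible contents of the slot; exactly one variant is held at any time.
sealed interface Outcome {
    data class Text(val value: String) : Outcome
    data class ErrorCode(val value: Int) : Outcome
    object NotSet : Outcome
}

// Mimics a builder for a message with `oneof outcome { string text; int32 error_code; }`.
class ResponseBuilderSketch {
    var outcome: Outcome = Outcome.NotSet
        private set

    // Setting any member overwrites the slot, which is how the other member gets cleared.
    fun setText(value: String) = apply { outcome = Outcome.Text(value) }
    fun setErrorCode(value: Int) = apply { outcome = Outcome.ErrorCode(value) }
    fun clearOutcome() = apply { outcome = Outcome.NotSet }
}

fun describe(o: Outcome): String = when (o) {
    is Outcome.Text -> "text: ${o.value}"
    is Outcome.ErrorCode -> "error: ${o.value}"
    Outcome.NotSet -> "nothing set"
}

fun main() {
    val builder = ResponseBuilderSketch().setText("ok").setErrorCode(404)
    println(describe(builder.outcome))   // error: 404
}
```

The exhaustive when is the Kotlin counterpart of switching on the generated case enum: the compiler reminds you to handle every member, including the "nothing set" state.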

Map fields: Proto3 supports mapping keys to values directly with a map<key_type, value_type> field. For example, map<string, int32> user_scores = 5; would define a map from strings to ints. Under the hood, map is syntactic sugar: the compiler generates a hidden message type for the map entries (with key and value fields) and a repeated field of that entry message. But for you as the developer, you can treat it as a Map<K,V> in code. Keys are unique and must be an integral or string type (floating-point types, bytes, enums, and messages are not allowed as keys), while values can be any type except another map. A map field itself cannot be marked repeated, and you should not rely on the iteration order of entries. If you need to represent associative arrays or dictionaries, map fields are the way to go.
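The "hidden entry message" idea can be sketched in Kotlin by encoding a map<string, int32> by hand: each entry becomes a small nested message with the key as field 1 and the value as field 2, written under the map field's own tag. This is an illustrative approximation assuming non-negative values; putVarint, encodeEntry, and encodeMap are invented names, not library functions.

```kotlin
import java.io.ByteArrayOutputStream

// Writes a non-negative Int as a base-128 varint.
fun ByteArrayOutputStream.putVarint(value: Int) {
    var v = value
    while (v >= 0x80) {
        write((v and 0x7F) or 0x80)
        v = v ushr 7
    }
    write(v)
}

/** One synthetic entry message: key is field 1 (string), value is field 2 (int32). */
fun encodeEntry(key: String, value: Int): ByteArray {
    require(value >= 0) { "this sketch handles non-negative values only" }
    val out = ByteArrayOutputStream()
    val keyBytes = key.toByteArray(Charsets.UTF_8)
    out.putVarint((1 shl 3) or 2)   // key: length-delimited
    out.putVarint(keyBytes.size)
    out.write(keyBytes)
    out.putVarint((2 shl 3) or 0)   // value: varint
    out.putVarint(value)
    return out.toByteArray()
}

/** The map field is written like a repeated field of entry messages. */
fun encodeMap(fieldNumber: Int, map: Map<String, Int>): ByteArray {
    val out = ByteArrayOutputStream()
    for ((key, value) in map) {
        val entry = encodeEntry(key, value)
        out.putVarint((fieldNumber shl 3) or 2)   // each entry is a nested message
        out.putVarint(entry.size)
        out.write(entry)
    }
    return out.toByteArray()
}

fun main() {
    println(encodeMap(5, mapOf("a" to 1)).toList())   // [42, 5, 10, 1, 97, 16, 1]
}
```

Reading the bytes left to right: 42 is the tag for field 5, 5 is the entry length, and the remaining five bytes are the entry's key and value fields.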

Enumerations (Enums)

An enum in proto3 lets you define a set of constant values for a field. Enums are useful when a field should only have one of a few predefined values. For example, you might define an enum for phone types:

enum PhoneType {
  PHONE_TYPE_UNSPECIFIED = 0;
  MOBILE   = 1;
  HOME     = 2;
  WORK     = 3;
}

This enum defines four values (the first is an unspecified/default value, which is set to 0). In proto3, the first defined enum value must be zero, and it is used as the default for the enum field if it’s not set (Protocol Buffer Basics: Kotlin | Protocol Buffers Documentation). You can name the zero value something like UNSPECIFIED or NONE as a convention. Every enum value must be given an explicit number; protobuf does not number values automatically, and the numbers need not be consecutive as long as they are unique.

In your messages, you can use the enum as a field type. For instance, in the earlier Person message, if we want a field for phone type, we could include PhoneType phone_type = 4; (or use a repeated PhoneType if a person could have multiple phone types, though in this case one type per phone number makes more sense). Enums in the generated Kotlin/Java code become enum classes or static enums, and the field will be that enum type. If an enum field isn’t set, it defaults to the first value (which we set to represent an unspecified state) (Protocol Buffer Basics: Kotlin | Protocol Buffers Documentation).

Proto3 enums have a few rules: value names must be unique, and by default each number can be used only once within an enum. If you really want two names for the same number (an alias), you must opt in explicitly with option allow_alias = true; inside the enum. Also, proto3 does not allow you to specify custom default values for enums (it’s always the zero value).

Packages and Imports

Larger projects often have multiple proto files. To avoid naming conflicts and to organize your messages, you can declare a package at the top of the proto file. For example:

syntax = "proto3";
package com.example.myapp;

This package declaration doesn’t affect the binary format, but it will be reflected in the generated code’s namespace (in Java/Kotlin, it helps form the Java package for the classes). By default, if you generate Java/Kotlin code, the code will use the proto package as part of the class package name. (There is also a java_package option we’ll discuss later for explicitly controlling the generated code’s package). Declaring a package is good practice to ensure your message names don’t collide with those from other projects or libraries (Protocol Buffer Basics: Kotlin | Protocol Buffers Documentation).

If your proto file uses message types that are defined in another file, you’ll need to import that file. Imports in proto files look like:

import "other_messages.proto";

This is similar to import statements in programming languages. For example, Google provides some common .proto definitions (called Well-Known Types), such as google/protobuf/timestamp.proto for timestamps and google/protobuf/duration.proto for time durations. If you want to use a Timestamp message in your proto, you would add import "google/protobuf/timestamp.proto"; and then you can use google.protobuf.Timestamp as a field type (Protocol Buffer Basics: Kotlin | Protocol Buffers Documentation). The compiler needs to know where to find the imported proto files (usually you provide include paths when invoking protoc, or the Gradle plugin handles it). Each proto file declares its own syntax and package. Types from an imported file are referenced by their fully qualified name (package plus type name, as in google.protobuf.Timestamp); if the imported file shares your file’s package, the bare type name is enough.

In summary, use package to namespace your proto definitions, and use import to include definitions from other proto files. This is analogous to package naming and imports in Kotlin/Java source code.

Services and RPCs in Proto3

Protocol Buffers aren’t just about data structures; they can also define services for RPC (Remote Procedure Calls). This is more relevant if you are using gRPC (Google’s RPC framework) or another RPC system. In a proto file, a service is defined using the service keyword, and inside you can define RPC methods. For example:

service UserService {
  rpc GetUser(UserRequest) returns (UserResponse);
  rpc UpdateUser(UserUpdateRequest) returns (UserResponse);
}

Here we define a service UserService with two RPC methods: GetUser and UpdateUser. Each RPC has a request message type and a response message type (which you would have defined as messages elsewhere in the proto file). The proto compiler can generate service interface code or stubs in various languages if you have the appropriate plugins. For instance, with gRPC, the compiler can generate a Kotlin (or Java) interface for the server and a stub for clients to call these methods as if they were local functions.

For example, if we had a search service, we might see something like this in the proto file:

service SearchService {
  rpc Search(SearchRequest) returns (SearchResponse);
}

This indicates an RPC method Search that takes a SearchRequest message and returns a SearchResponse message (Language Guide (proto 3) | Protocol Buffers Documentation). The protoc compiler (with the gRPC plugin) would generate code to help implement this service and call it from clients.

Note: In mobile app development, it's less common to run a full gRPC server in your Android app (usually the phone is the client talking to a server). However, you can certainly use gRPC on Android to call RPC services defined by protobuf: Google provides libraries for gRPC in Android and even a Lite version for mobile. If you are not using gRPC, you can ignore service definitions entirely, since they’re optional. Many .proto files for data exchange (like DataStore schemas or network message formats) do not include any service definitions; they only define messages. But it’s good to know the syntax is there in case you venture into RPC. If you do define a service and you’re using it on Android, you will need to include the gRPC Kotlin/Java libraries and generate gRPC stubs with a gRPC protoc plugin (protoc-gen-grpc-java, plus protoc-gen-grpc-kotlin if you want coroutine-based Kotlin stubs) in addition to the base protobuf classes. For this tutorial, we focus on the data (message) aspect, as services are less common in basic mobile scenarios.

Options and Custom Options

Protobuf allows you to specify options to customize code generation or alter behaviors. Options can appear at the top of the file (file options), or on message, field, enum, and service definitions. They are usually not required, but can be very useful. Here are some common options you’ll encounter:

java_package (file option): This option specifies the Java/Kotlin package to use for generated classes (Language Guide (proto 3) | Protocol Buffers Documentation). By default, if you don’t specify this, the proto package will be used to determine the package of generated Java classes. However, sometimes your proto package might not conform to Java package naming conventions or you want the generated code in a specific package. For example:
```
option java_package = "com.example.myapp.protos";
```
This will put the generated classes into com.example.myapp.protos package, regardless of what the package in the proto file is. It’s a good practice to set this to a proper Java package (especially if your proto package is something like tutorial or doesn’t match your codebase structure).
java_multiple_files (file option): By default, protoc might generate an outer class that contains all your message classes as inner classes (when multiple messages or enums are in one file). If you set option java_multiple_files = true;, it will generate each top-level message/enum as its own separate class file (Understanding Protocol Buffers for Android Development | by BHAVNA THACKER | Medium). This option is commonly enabled because it integrates better with Java/Kotlin tooling (each message as an independent class). It’s recommended to set this to true in most cases, so your output isn’t one giant wrapper class.
java_outer_classname (file option): If you do not use java_multiple_files, protoc will wrap everything in one outer class. This option lets you specify the name of that wrapper class. For example, if your file is user.proto but you want the outer class to be UserProtos, you can set option java_outer_classname = "UserProtos";. If you use java_multiple_files = true, this option isn’t very relevant (there won’t be a single outer class) (Language Guide (proto 3) | Protocol Buffers Documentation) (Language Guide (proto 3) | Protocol Buffers Documentation).
optimize_for (file option): This option can be set to SPEED, CODE_SIZE, or LITE_RUNTIME. It influences how the code is generated especially for C++ and Java. The default is SPEED, which generates fully featured classes with all the methods (fast, but larger code). CODE_SIZE generates smaller code by using reflection for some operations (slower but reduces binary size). LITE_RUNTIME generates classes that depend on the lite version of the runtime, which strips out descriptors and reflection support for a much smaller footprint – ideal for Android/mobile (Language Guide (proto 3) | Protocol Buffers Documentation). In practice, when targeting Android, you’ll often use the Lite runtime (we’ll show how via Gradle plugin settings instead of using this option directly). For example:
```
option optimize_for = LITE_RUNTIME;
```
would generate code for the lite runtime (if you were not using the Gradle plugin’s built-in mechanism). Nowadays, the Gradle plugin’s lite option is preferred over manually putting this in the proto.
Other options: There are options to mark fields or messages as deprecated (deprecated = true on a field, so that use in code will show a deprecation warning), options specific to other languages (like csharp_namespace for C# namespace, etc.), and options to control generated service code (java_generic_services which is usually not used anymore because gRPC has its own generator) (Language Guide (proto 3) | Protocol Buffers Documentation).

In addition to built-in options, Protocol Buffers allow the definition of custom options. This is an advanced feature where you can extend the protobuf descriptors to attach custom metadata. For example, you could define a custom option that applies to fields to indicate validation rules or UI hints. Defining custom options involves writing a special proto (or using google.protobuf.descriptor to extend, which is a proto2 feature). For instance, you might see something like:

import "google/protobuf/descriptor.proto";

extend google.protobuf.FieldOptions {
  string ui_label = 51234;
}

message Person {
  string name = 1 [(ui_label) = "Full Name"];
  // ...
}

Here we created a custom field option ui_label and used it on the name field. This doesn’t affect how the message is serialized, but code that has access to the message descriptor can retrieve this metadata (as can other tools that read the proto file). Keep in mind that descriptors are part of the full runtime; the lite runtime typically used on Android omits them, so on Android custom options are mainly useful to build-time tools. Custom options are primarily useful if you are building tooling around protos or need annotations for code generation. For a beginner, it’s enough to know that custom options exist; using them is quite rare unless you have a specific need.

Summary: Options fine-tune the code generation and behavior. The most relevant for Android/Kotlin are java_package and java_multiple_files to organize code, and using the lite runtime for smaller code. Custom options allow extending the schema definitions with additional info, but are an advanced topic you can explore later if needed.

Now that we have an understanding of proto3 syntax and how to define our data schema, let’s move on to using these definitions in an Android project.

Compiling a Proto3 File for Android (Kotlin)

Once you have a .proto file with your message definitions, you need to compile it to generate the Kotlin (or Java) classes that you will use in your Android app. Google provides the protoc compiler for Protocol Buffers, and there are Gradle plugins that make integration with an Android/Kotlin project easier. In this section, we’ll cover how to set up your Android project to compile proto files, including installing protoc and configuring Gradle with the protobuf plugin. We’ll focus on using Gradle (the build system for Android) so that the proto compilation happens automatically as part of your project build.

You have two main approaches: using the Gradle Protobuf Plugin (which automates protoc invocation during builds), or manually running the protoc compiler. We highly recommend the Gradle plugin approach for Android projects.

Setting up `protoc` and the Gradle Plugin

Add the Protobuf Gradle Plugin to your project: In your project’s top-level build.gradle (or settings.gradle for newer Android Gradle plugin versions), include the protobuf plugin so Gradle knows about it. For example, in the build.gradle you might add: (Setting Up Protocol Buffers in an Android Project | by Vishwajith Shettigar | Medium) (Setting Up Protocol Buffers in an Android Project | by Vishwajith Shettigar | Medium)
```
buildscript {
    dependencies {
        // ... other classpaths
        classpath "com.google.protobuf:protobuf-gradle-plugin:0.8.19"
    }
}
```
(Replace 0.8.19 with the latest version of the plugin available.) If you’re using the plugins DSL, you can instead apply it in your module’s plugins {} block with: id "com.google.protobuf" version "0.8.19".
Install the Protocol Buffers compiler (protoc): The Gradle plugin can automatically download the specified version of protoc for you, so you usually don’t need to manually install anything. In your app module’s build.gradle, configure the plugin to use a specific protoc version. For example: (Working with Proto DataStore | Android Developers)
```
apply plugin: 'com.google.protobuf'  // apply the protobuf plugin at the top of the file

protobuf {
    protoc {
        artifact = "com.google.protobuf:protoc:3.21.7"  // specify protoc version
    }
    ...
}
```
This tells Gradle to fetch protoc compiler version 3.21.7 (which supports proto3 and Kotlin code generation). You can pick a recent stable version of protoc (3.21.x or newer). The protoc compiler will be used during the build to compile your .proto files.
Configure code generation options (Java vs Kotlin, lite runtime): By default, protoc will generate Java classes. Since we’re using Kotlin, we have two choices: use the Java classes directly in Kotlin (which is perfectly fine thanks to Kotlin’s Java interoperability), or generate Kotlin classes that include Kotlin-specific features. The official protoc compiler now supports generating Kotlin code (which wraps the Java generated code with some Kotlin-friendly APIs). Let’s set up both the lite runtime (for Android efficiency) and Kotlin generation. Inside the protobuf { ... } block in Gradle, add:
```
    generateProtoTasks {
        all().forEach { task ->
            task.builtins {
                java { 
                    option 'lite'  // use lite runtime for smaller code
                }
                kotlin {
                    option 'lite'  // Kotlin output is built into protoc (no extra plugin); keep it on the lite runtime too
                }
            }
        }
    }
```
This configuration does a few things (Securing Your Kotlin Application: Preventing Broken Auth):
- It applies to all proto compilation tasks (usually just the main source set).
- It enables the Java lite runtime by passing the lite option. This means the generated code will use the streamlined protobuf-javalite library (which omits reflection and reduces method count), which is important for Android apps to save on size (Language Guide (proto 3) | Protocol Buffers Documentation).
- It enables Kotlin code generation (the kotlin block). The protobuf Gradle plugin has built-in support for Kotlin as a generation target (as of plugin version 0.8.18+ and protoc 3.17+), so no separate protoc plugin is needed. This will produce .kt files in addition to .java files. Under the hood, the Kotlin generated code relies on the Java classes, but provides a nicer Kotlin API (like builder DSL functions).
Add Protobuf dependencies to your module: You need to include the protobuf runtime library in your app, so that you can use the generated classes. For Android with lite runtime, add the protobuf-javalite dependency. If using Kotlin generation with the lite runtime, also add the protobuf-kotlin-lite extension library. For example, in your module’s dependencies block: (Setting Up Protocol Buffers in an Android Project | by Vishwajith Shettigar | Medium)
```
dependencies {
    implementation "com.google.protobuf:protobuf-javalite:3.21.7"
    implementation "com.google.protobuf:protobuf-kotlin-lite:3.21.7"  // Kotlin extensions for the lite runtime (optional)
    // (plus any other dependencies like AndroidX, etc.)
}
```
The version for protobuf-kotlin-lite should match the protobuf runtime version (here we use 3.21.7 as an example). The protobuf-javalite library contains the core classes needed at runtime (like MessageLite and parsing utilities) (Setting Up Protocol Buffers in an Android Project | by Vishwajith Shettigar | Medium). The protobuf-kotlin-lite library contains the Kotlin extension functions and classes that the generated Kotlin code relies on when targeting the lite runtime. The plain protobuf-kotlin artifact is its counterpart for the full protobuf-java runtime; combining it with protobuf-javalite tends to cause duplicate-class build errors, so pick the pair that matches your runtime. If you decided not to generate Kotlin code, you can omit the Kotlin artifact and just use protobuf-javalite. Conversely, if you want the full (non-lite) runtime (not recommended for Android), you’d use protobuf-java and protobuf-kotlin instead, but we’ll stick with lite here.
Organize your proto source files: By default, the protobuf Gradle plugin looks for proto files in src/main/proto/ for an Android app module. Create this directory if it doesn’t exist. For example, you might have app/src/main/proto/mydata.proto. You can also configure a different location via sourceSets if needed, but using the default src/main/proto is simplest (Setting Up Protocol Buffers in an Android Project | by Vishwajith Shettigar | Medium). If you have multiple modules, you can even dedicate a module for protos (as shown in some setups where they make a :model module) (Setting Up Protocol Buffers in an Android Project | by Vishwajith Shettigar | Medium), but for a beginner, keeping protos in the app module is fine.
Build the project: Once the above is set up, sync your Gradle project and do a build (or Rebuild Project in Android Studio). The plugin will invoke protoc to generate code from your .proto files. If everything is set up correctly, you should see generated Java and Kotlin source files under build/generated/source/proto/ in your module (the plugin will automatically include these in the compilation classpath) (Setting Up Protocol Buffers in an Android Project | by Vishwajith Shettigar | Medium) (Working with Proto DataStore | Android Developers). For example, if you had mydata.proto with a message UserPreferences, you would get a UserPreferences.java (and possibly UserPreferencesKt.kt for Kotlin usage). If you don’t see generated files, make sure your Gradle configuration is correct, and try a clean/rebuild (and check that the proto file is in the right folder).

That’s it! You’ve configured your Android project to compile protobufs. Each time you build, if the proto file changes (or on a clean build), protoc will run and update the generated classes. You can now use those classes in your Kotlin code.

Note: Alternatively, you could compile proto files manually by downloading the protoc binary from the official website and running commands like protoc --java_out=path/to/output myfile.proto. But using Gradle is far more convenient and ensures everyone working on the project or your CI environment generates the code consistently. If you ever need to generate code for other languages (for example, to share the same .proto with a backend in Go or Python), you’d run protoc with the respective plugin for those languages separately. For Kotlin/Java in Android, the Gradle plugin handles our needs.

Understanding the Generated Code

Whether you generated Java or Kotlin classes from your proto, the structure is similar. Each proto message becomes a class in the target language, with methods to get and set fields, and to serialize/deserialize. Let’s outline what you get from a proto3 message definition:

For each message (e.g., Person), protoc generates a class Person. In Java, this class typically extends GeneratedMessageLite (for lite runtime) or GeneratedMessageV3 (for full runtime). In Kotlin, if you enabled Kotlin generation, you get a corresponding PersonKt DSL class and extension functions, but conceptually you still work with the Person class.
The class contains getters (and sometimes setters or builder methods) for each field. In Kotlin, the Java getters appear as properties. For example, you can access person.name to get the name (under the hood it’s calling getName()).
A builder pattern is used to create or modify instances. The Person class will have an inner Person.Builder class (in Java), and in Kotlin you may also have a top-level function to build via DSL. Typically, you don’t directly call a constructor on these classes; instead you do Person.newBuilder(), set the fields, then call build() to get an immutable object. (Built messages are immutable in both the full and lite runtimes; to change one, call toBuilder() on it, modify the builder, and call build() again.)
Serialization methods: Every message class has methods to serialize to binary or parse from binary. For example, Person.parseFrom(byteArray) will construct a Person object from a ByteArray of data, and person.toByteArray() gives you a ByteArray containing the serialized form of that person (Overview | Protocol Buffers Documentation). There are also methods to write to an OutputStream (e.g., person.writeTo(outputStream)) (Overview | Protocol Buffers Documentation), or parse from an InputStream (Person.parseFrom(inputStream)), etc. These make it easy to read/write protobuf messages from files, network sockets, or other I/O.
If you had repeated fields, the generated code will use a list to represent them. For example, a repeated string phone_numbers becomes something like List<String> getPhoneNumbersList() in Java, and in Kotlin you read it as person.phoneNumbersList (inside a Kotlin DSL block the same field is exposed as phoneNumbers). You’ll also have methods to get the count (getPhoneNumbersCount()) and index into the list (getPhoneNumbers(int index)).
If you had nested message or enum types (like Person.PhoneNumber or Person.PhoneType from earlier examples), those always become nested classes or enums inside the class of their parent message, so you refer to them as Person.PhoneNumber and Person.PhoneType in both Java and Kotlin. The java_multiple_files option only affects top-level types: when it is false, every top-level message and enum is nested inside a single outer wrapper class named after the .proto file; when it is true, each top-level message and enum gets its own class in its own file. The exact arrangement can be controlled with options, but the key point is you will have classes for each message and enum.
If Kotlin generation is enabled, you get some extra nice touches:
- For each message, an extension function is generated that allows a DSL style building. For example, you might have a function fun person(block: PersonKt.Dsl.() -> Unit): Person in Kotlin. This lets you create a Person in a Kotlin-idiomatic way like:
  val p = person {
      id = 1234
      name = "John Doe"
      email = "[email protected]"
  }
  Under the hood, this is using a PersonKt.Dsl builder class where each field is a var that you can assign within the lambda (Securing Your Kotlin Application: Preventing Broken Auth) (Kotlin Generated Code Guide | Protocol Buffers Documentation). When the block is done, it builds the Person. This is equivalent to using the Person.newBuilder() approach, but often more concise in Kotlin.
- Kotlin extension functions also provide operator overloads for repeated fields (so you can do something like person { phoneNumbers += "123-4567" } perhaps) (Kotlin Generated Code Guide | Protocol Buffers Documentation) and other conveniences. All these are built on top of the core Java implementation.
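
As a rough sketch of the DSL in use, the snippet below assumes the addressbook.proto schema shown later in this guide (java_package com.example.addressbook.proto) with Kotlin code generation enabled. buildJohn and withWorkPhone are hypothetical helper names; person, copy, and PersonKt.phoneNumber are the functions the Kotlin generator is expected to emit for that schema.

```kotlin
import com.example.addressbook.proto.Person
import com.example.addressbook.proto.PersonKt.phoneNumber
import com.example.addressbook.proto.copy
import com.example.addressbook.proto.person

// Builds a Person with the generated DSL function (hypothetical helper).
fun buildJohn(): Person = person {
    id = 1234
    name = "John Doe"
    // Repeated fields support += inside the DSL block.
    phones += phoneNumber {
        number = "555-4321"
        type = Person.PhoneType.HOME
    }
}

// Returns a modified copy; the original message is not changed.
fun withWorkPhone(original: Person): Person = original.copy {
    phones += phoneNumber {
        number = "555-8765"
        type = Person.PhoneType.WORK
    }
}
```

Because messages are immutable, copy { ... } produces a new Person and leaves the one it was called on as it was.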

If you did not enable Kotlin codegen, you will still use the Java-generated classes in your Kotlin code. That’s absolutely fine: Kotlin can interact with Java classes seamlessly. You would just use the builder methods or static methods in a slightly more Java-like style. For example, to create a Person you might do:

val personBuilder = Person.newBuilder()
personBuilder.id = 1234            // Kotlin exposes the builder's getId()/setId() pair as a synthetic property
personBuilder.setName("John Doe")
personBuilder.email = "[email protected]"
val person = personBuilder.build()

Or chaining it fluently:

val person = Person.newBuilder()
    .setId(1234)
    .setName("John Doe")
    .setEmail("[email protected]")
    .build()

Either way, you end up with a Person object with those fields set.

The generated classes also include some other methods:

getDefaultInstance(): a static method that returns a default instance (all fields unset/default). Useful if you need a baseline or to compare against.
newBuilder(existingMessage): to create a builder pre-populated with an existing message’s data (to modify a copy of a message).
For each field, there might be a hasX() method (for example hasEmail()). In proto3 it is generated only for fields that track presence: message-typed fields, fields declared optional, and members of a oneof (plain proto3 scalar fields don't track presence). For a oneof, the generated code also includes a case enum and a getXCase() method, named after the oneof, to see which field is set.
Enum classes for enum types, with values like PhoneType.MOBILE etc. There’s usually an UNRECOGNIZED value as well to handle unknown numeric values that aren’t in the enum (proto3 enums are open-ended for forward compatibility).
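
To illustrate these helper methods, here is a small sketch that assumes the Person message from the addressbook.proto schema shown later in this guide. The function names renamed, phoneTypeLabel, and isUnset are hypothetical; newBuilder(prototype), forNumber, and getDefaultInstance are part of the generated Java API.

```kotlin
import com.example.addressbook.proto.Person

// Returns a modified copy; the original message is left untouched.
fun renamed(original: Person, newName: String): Person =
    Person.newBuilder(original)   // same effect as original.toBuilder()
        .setName(newName)
        .build()

// Maps a raw numeric value to an enum name. forNumber returns null when the
// number is not defined in the schema, so we fall back to a placeholder label.
fun phoneTypeLabel(rawValue: Int): String =
    Person.PhoneType.forNumber(rawValue)?.name ?: "unknown($rawValue)"

// A message with no fields set compares equal to the default instance.
fun isUnset(person: Person): Boolean = person == Person.getDefaultInstance()
```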

One thing to note: The classes generated by the lite runtime do not include the full reflective descriptor API. They are leaner. So you won't have methods like Person.getDescriptor() in lite, but that’s rarely needed in application code.

Using Protobuf Classes in an Android App (Serialization & Deserialization)

Now that you have the generated data classes, using them in an Android app is straightforward. You treat them like you would any model objects, with the added capability to efficiently serialize/deserialize them.

Creating and populating a message: As shown above, use the builder or Kotlin DSL to set fields. For example:

// Using builder to create an instance
val newUser = User.newBuilder()
    .setId(42)
    .setName("Alice")
    .setEmail("[email protected]")
    .build()

If you enabled Kotlin DSL generation, you can do the same more idiomatically:

val newUser = user {  // 'user' is a generated top-level function for User message
    id = 42
    name = "Alice"
    email = "[email protected]"
}

Both approaches produce an immutable User message object with the fields set.

Reading data (parsing): Suppose you received a ByteArray from a network response or you read from a file that contains a serialized User message. You can parse it into an object:

val userBytes: ByteArray = ... // bytes from somewhere (network, file, etc)
val user = User.parseFrom(userBytes)

This will throw an InvalidProtocolBufferException if the data is not a valid serialization of User. In many cases, you might wrap parsing in a try-catch to handle corrupted data (Working with Proto DataStore | Android Developers). If you’re reading from a stream:

val inputStream: InputStream = ... 
val user = User.parseFrom(inputStream)

There is also User.parseDelimitedFrom(stream) for reading length-delimited messages from a stream (useful when multiple messages are written back to back with writeDelimitedTo); it returns null once the stream has reached its end. In Android, if you use DataStore (as an example), the Serializer you supply typically calls parseFrom(input) to read your data object from disk (Working with Proto DataStore | Android Developers).
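
The parsing options above could be wrapped roughly as follows. This is a sketch that uses the Person message from the addressbook.proto schema shown later in this guide rather than User; parsePersonOrNull, writePeople, and readPeople are hypothetical helper names.

```kotlin
import com.example.addressbook.proto.Person
import com.google.protobuf.InvalidProtocolBufferException
import java.io.InputStream
import java.io.OutputStream

// Returns null instead of throwing when the bytes are not a valid Person.
fun parsePersonOrNull(bytes: ByteArray): Person? =
    try {
        Person.parseFrom(bytes)
    } catch (e: InvalidProtocolBufferException) {
        null
    }

// Writes several messages back to back, each prefixed with its length.
fun writePeople(people: List<Person>, output: OutputStream) {
    people.forEach { it.writeDelimitedTo(output) }
}

// Reads length-prefixed messages until parseDelimitedFrom returns null (end of stream).
fun readPeople(input: InputStream): List<Person> =
    generateSequence { Person.parseDelimitedFrom(input) }.toList()
```

Note that an empty byte array is a valid encoding (of a message with every field at its default), so a null result here signals malformed bytes rather than missing data.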

Writing data (serializing): To send or store a message, you convert it to bytes:

val data: ByteArray = user.toByteArray()

Now you can, for example, write this byte array to a FileOutputStream or send it over a network socket. If using an OutputStream directly, you can also do user.writeTo(outputStream) (Overview | Protocol Buffers Documentation). The advantage of writeTo is that it encodes the message straight into the stream instead of first allocating a byte array for the whole message (useful for very large messages), but for most purposes, toByteArray is fine.
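
A minimal sketch of saving and loading a message with the stream-based methods, again assuming the Person message from the addressbook.proto schema shown later in this guide. savePerson and loadPerson are hypothetical helper names.

```kotlin
import com.example.addressbook.proto.Person
import java.io.File

// Serializes the message directly into the file's output stream.
fun savePerson(person: Person, file: File) {
    file.outputStream().use { person.writeTo(it) }
}

// Parses a message from the file; throws an IOException subtype on bad data.
fun loadPerson(file: File): Person =
    file.inputStream().use { Person.parseFrom(it) }
```

On Android, the File would usually point somewhere under the app's private storage, for example a file inside context.filesDir.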

These messages can be easily transmitted between app components or over the network. For instance, if you wanted to pass a protobuf message between Android Activities or Services, you could put the byte array in an Intent extra. (There’s no built-in Android Parcelable for protos, but you can always serialize to bytes for intent extras or savedInstanceState).
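
Passing a message through an Intent could look like the sketch below. It assumes the Person message from the addressbook.proto schema shown later in this guide; the extra key and the two extension functions are hypothetical names, not part of any library.

```kotlin
import android.content.Intent
import com.example.addressbook.proto.Person

// Hypothetical extra key used only in this sketch.
const val EXTRA_PERSON = "com.example.addressbook.EXTRA_PERSON"

// Stores the serialized message as a byte-array extra.
fun Intent.putPerson(person: Person): Intent =
    putExtra(EXTRA_PERSON, person.toByteArray())

// Reads the extra back; returns null when the extra is absent.
fun Intent.getPerson(): Person? =
    getByteArrayExtra(EXTRA_PERSON)?.let { Person.parseFrom(it) }
```

Intent extras travel through Binder transactions, which have a small size budget, so this approach suits small messages; larger payloads are better written to a file and referenced by path or URI.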

Integrating with networking: If you have a backend service that speaks protobuf (e.g., a gRPC service or a custom HTTP endpoint expecting protobuf), you can take the ByteArray from toByteArray() and send it in the request body. Ensure you set the appropriate content type (e.g., application/octet-stream or application/x-protobuf) if using HTTP. On response, you’d parse the bytes back into a message using parseFrom. If you use gRPC, the gRPC library will handle calling these under the hood, presenting you with generated stub methods that accept and return the message classes.
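
For a plain HTTP endpoint, the request/response round trip might be sketched like this with HttpURLConnection. The endpoint and its behavior (accepting a serialized Person and answering with a serialized Person) are assumptions made for illustration, as is the helper name postPerson; Person comes from the addressbook.proto schema shown later in this guide.

```kotlin
import com.example.addressbook.proto.Person
import java.net.HttpURLConnection
import java.net.URL

// Posts a Person as a protobuf body and parses the response as a Person.
// Assumes a hypothetical endpoint that replies with a serialized Person.
fun postPerson(endpoint: URL, person: Person): Person {
    val connection = endpoint.openConnection() as HttpURLConnection
    try {
        connection.requestMethod = "POST"
        connection.doOutput = true
        connection.setRequestProperty("Content-Type", "application/x-protobuf")
        connection.outputStream.use { person.writeTo(it) }
        check(connection.responseCode == HttpURLConnection.HTTP_OK) {
            "Unexpected HTTP status ${connection.responseCode}"
        }
        return connection.inputStream.use { Person.parseFrom(it) }
    } finally {
        connection.disconnect()
    }
}
```

On Android this call has to run off the main thread, for example inside a coroutine on Dispatchers.IO.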

Using JSON (if needed): Proto3 has a standard JSON mapping, and the Java library ships a JsonFormat utility class (in the separate protobuf-java-util artifact) to convert protos to/from JSON. JsonFormat relies on message descriptors, so it works with the full runtime and not with the lite runtime commonly used on Android. This is handy if you need to, say, send a protobuf message to a web service that expects JSON, or for debugging. However, using JSON loses the size/speed benefits, so typically you’d only use it when necessary.
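
A short sketch of the JSON conversion, assuming the full (non-lite) runtime plus the protobuf-java-util dependency, and the Person message from the addressbook.proto schema shown later in this guide. toJson and fromJson are hypothetical helper names.

```kotlin
import com.example.addressbook.proto.Person
import com.google.protobuf.util.JsonFormat

// Renders the message using the standard proto3 JSON mapping.
fun toJson(person: Person): String = JsonFormat.printer().print(person)

// Parses JSON into a Person, skipping JSON keys the schema does not define.
fun fromJson(json: String): Person {
    val builder = Person.newBuilder()
    JsonFormat.parser().ignoringUnknownFields().merge(json, builder)
    return builder.build()
}
```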

Equality and other methods: Protobuf message classes come with .equals() and .hashCode() implemented (they compare field values), and .toString() which gives a human-readable debug string of the message (useful for logging).

At this point, you’ve defined a schema and generated code, and you know how to create, populate, serialize, and parse those messages in your Android app. Let’s put it all together with a concrete example.

Real-World Example: Using Proto3 in an Android Kotlin Project

To make things concrete, let’s walk through a real-world example. Imagine we are building a simple address book feature in an Android app. We want to store contact information (people’s names, IDs, emails, and phone numbers) and perhaps send this data over the network or save it locally. We’ll define this data with protobuf, compile it, and use it in Kotlin code.

The `.proto` Schema

We create a file addressbook.proto in our project (under app/src/main/proto/). Here’s what it might look like:

syntax = "proto3";
package com.example.addressbook;    // Proto package for our messages

option java_package = "com.example.addressbook.proto";  // package for generated code
option java_multiple_files = true;                     // generate separate classes for each message

message Person {
  int32 id = 1;               // Unique ID for the person
  string name = 2;
  string email = 3;

  message PhoneNumber {
    string number = 1;
    PhoneType type = 2;
  }

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }

  repeated PhoneNumber phones = 4;
}

message AddressBook {
  repeated Person people = 1;
}

Let’s break down what we have here:

We declared syntax = "proto3"; and a package com.example.addressbook. We also set java_package to com.example.addressbook.proto (this means our generated Kotlin/Java classes will be in that package) and java_multiple_files = true so that the top-level messages Person and AddressBook are each generated as their own class in their own file; the nested types stay nested, as Person.PhoneNumber and Person.PhoneType (Understanding Protocol Buffers for Android Development | by BHAVNA THACKER | Medium).
We have a Person message with fields:
- id (an integer identifier),
- name (string),
- email (string),
- phones (a repeated field of PhoneNumber messages).
Inside Person, we defined a nested message PhoneNumber with a number (string phone number) and a type (using an enum).
The PhoneType enum is nested in Person and defines three possible types of phone numbers.
Finally, an AddressBook message contains a list of Person entries.

This schema allows us to represent an address book with multiple people, each of whom can have multiple phone numbers. It’s similar to examples you’ll find in official protobuf tutorials (Protocol Buffer Basics: Kotlin | Protocol Buffers Documentation) (Protocol Buffer Basics: Kotlin | Protocol Buffers Documentation).

After adding this proto file, we rebuild the project. The protobuf compiler will generate Kotlin/Java classes for Person, Person.PhoneNumber, Person.PhoneType, and AddressBook.

Generated Kotlin Classes (Overview)

With java_multiple_files = true, we expect separate classes. Key classes generated include:

Person – a class with fields id, name, email, and phones. It will have methods like getId(), getName(), etc., and List<Person.PhoneNumber> getPhonesList(). In Kotlin, these appear as properties (person.id, person.name, person.phonesList, etc.). It also has a nested builder class and static methods like parseFrom and newBuilder().
Person.PhoneNumber – a static nested class inside Person representing the phone number message. It has number and type fields with their accessors.
Person.PhoneType – a Java enum nested inside Person, which Kotlin sees as an ordinary enum with the values MOBILE, HOME, WORK, and UNRECOGNIZED. UNRECOGNIZED is generated for proto3 enums to represent numeric values that are not defined in the enum. You use it like Person.PhoneType.MOBILE in code.
AddressBook – a class with one repeated field of Person, exposed as getPeopleList() (peopleList in Kotlin) along with getPeopleCount() and getPeople(index).

If Kotlin generation was enabled, you would also have some Kotlin-specific files. For example, there is a PersonKt.kt with a Dsl class and a top-level function person {...} to build Person, and similarly for AddressBook (an addressBook {...} builder function). If you prefer, you can also use the builder classes directly in Kotlin.

We won't list the entire generated code (it’s quite verbose), but it essentially provides the API we described earlier. The main thing is that we now have classes we can use in our app code as if we wrote them by hand – but they’re optimized and tested, and they implement the serialization logic for us.

Using the Generated Classes in Android (Kotlin) Code

Now, let’s use these classes. Suppose we want to create a new AddressBook, add a Person to it, and then serialize it (maybe to send to a server or save to a file):

// Create a Person object using the builder
val alice = Person.newBuilder()
    .setId(1001)
    .setName("Alice")
    .setEmail("[email protected]")
    .addPhones(               // add a phone number
        Person.PhoneNumber.newBuilder()
            .setNumber("555-1234")
            .setType(Person.PhoneType.MOBILE)
            .build()
    )
    .addPhones(               // add another phone number
        Person.PhoneNumber.newBuilder()
            .setNumber("555-5678")
            .setType(Person.PhoneType.HOME)
            .build()
    )
    .build()

// Create an AddressBook and add the person to it
val addressBook = AddressBook.newBuilder()
    .addPeople(alice)
    .build()

// Serialize the AddressBook to bytes (e.g., to save to a file or send over network)
val data: ByteArray = addressBook.toByteArray()

// ... (imagine we send this data or write to disk)

// Deserialize the bytes back into an AddressBook object
val receivedAddressBook = AddressBook.parseFrom(data)
for (person in receivedAddressBook.peopleList) {
    Log.d("ProtoExample", "Person: id=${person.id}, name=${person.name}, email=${person.email}")
    for (phone in person.phonesList) {
        Log.d("ProtoExample", "  Phone: ${phone.number} (${phone.type})")
    }
}

Let’s walk through this code:

We used Person.newBuilder() to construct a Person for Alice. We set her id, name, and email using the builder’s setter methods. Then we used .addPhones(...) to add two phone numbers. Notice we created each phone number by calling Person.PhoneNumber.newBuilder(), setting its fields, and building it. We could have reused the builder for multiple phone numbers, but here we just build inline.
After setting up the builder, we call .build() to get an immutable Person object. Now alice is an instance of Person that we can use.
We then create an AddressBook via its builder, and we add alice to it with .addPeople(alice). If we had multiple Person objects, we could call .addPeople multiple times or .addAllPeople(listOfPeople).
We build the AddressBook. Now addressBook contains our data.
We call addressBook.toByteArray() to serialize the entire address book to a ByteArray. This binary data can be, for example, written to a file:
```
FileOutputStream(filePath).use { it.write(data) }
```
or sent over a network socket or HTTP body.
To read the data back, we use AddressBook.parseFrom(data) which returns a new AddressBook instance populated with the data from the byte array. We then iterate over the entries and log them. Accessing fields is easy: person.name etc. (In Java it would be person.getName()).
The output logs would show Alice’s info and her phone numbers. The enum phone.type when printed will show the enum name (MOBILE or HOME).

This example illustrates how you can move data between your app and storage/network seamlessly using protobuf. You didn’t have to write any serialization code – the generated classes did it for you, and they ensure that the data is compact.
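
If Kotlin code generation is enabled, the same address book could be built with the generated DSL instead of builders. The sketch below assumes the addressbook.proto schema above; buildAddressBook is a hypothetical helper name, while addressBook, person, and PersonKt.phoneNumber are the functions the Kotlin generator is expected to emit.

```kotlin
import com.example.addressbook.proto.AddressBook
import com.example.addressbook.proto.Person
import com.example.addressbook.proto.PersonKt.phoneNumber
import com.example.addressbook.proto.addressBook
import com.example.addressbook.proto.person

// Builds the same data as the builder-based example, using the Kotlin DSL.
fun buildAddressBook(): AddressBook = addressBook {
    people += person {
        id = 1001
        name = "Alice"
        phones += phoneNumber {
            number = "555-1234"
            type = Person.PhoneType.MOBILE
        }
        phones += phoneNumber {
            number = "555-5678"
            type = Person.PhoneType.HOME
        }
    }
}
```

The result is an ordinary AddressBook message, so toByteArray() and AddressBook.parseFrom(...) work on it exactly as before.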

A note on compatibility: If you later update the addressbook.proto (say you add a new field string address = 5; to Person), old data (without that field) can still be parsed – the new field will just have its default value (empty string) when you parse old data. And if new data with the field is parsed by old code (that doesn’t know about field 5), the old code will simply ignore the unknown field. This makes protobuf suitable for apps that might interact with servers of different versions or need to preserve data across app upgrades (Overview | Protocol Buffers Documentation).
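
One way to see the forward-compatibility behavior without maintaining two schema versions is to append, by hand, the bytes a newer writer would emit for the extra field and then parse the result with the current Person class. This is only a sketch: withFutureAddressField is a hypothetical helper, and the hand-rolled encoding handles just short strings.

```kotlin
import com.example.addressbook.proto.Person

// Simulates bytes from a newer schema that added `string address = 5;`.
fun withFutureAddressField(person: Person, address: String): ByteArray {
    val payload = address.toByteArray(Charsets.UTF_8)
    // A single length byte is enough only for payloads shorter than 128 bytes.
    require(payload.size < 128) { "sketch only handles single-byte lengths" }
    // Tag = (field number shl 3) or wire type; wire type 2 is length-delimited.
    val tag = ((5 shl 3) or 2).toByte()
    return person.toByteArray() + byteArrayOf(tag, payload.size.toByte()) + payload
}
```

Parsing the returned bytes with Person.parseFrom(...) is expected to succeed and to yield the original id, name, and phones, with field 5 treated as an unknown field.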

Where to Go Next

In this tutorial, we covered:

What Protocol Buffers are and why they can be useful in Android (e.g., for DataStore, efficient network communication, etc.).
The syntax of proto3 files: defining messages, fields, enums, packages, and more.
Setting up an Android Studio project to compile proto files using Kotlin (Gradle configuration and protoc plugin).
Understanding the generated code and how to use it to serialize/deserialize data in an Android app.
A concrete example demonstrating a proto schema and using it in code.

With this knowledge, you can start defining your own protobuf schemas for your apps. A common use in Android is with Jetpack DataStore (Proto DataStore), where you define a proto for your structured preferences and let DataStore handle reading/writing it. Another use might be if your app communicates with a backend via gRPC or if you have a custom binary protocol between devices.
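
For the DataStore case, the glue code is a Serializer implementation. The sketch below follows the pattern from the Android DataStore documentation and assumes the AddressBook message from this guide plus the androidx.datastore dependency; AddressBookSerializer is a hypothetical name.

```kotlin
import androidx.datastore.core.CorruptionException
import androidx.datastore.core.Serializer
import com.example.addressbook.proto.AddressBook
import com.google.protobuf.InvalidProtocolBufferException
import java.io.InputStream
import java.io.OutputStream

// Tells DataStore how to read and write AddressBook messages.
object AddressBookSerializer : Serializer<AddressBook> {
    // Used when no file exists on disk yet.
    override val defaultValue: AddressBook = AddressBook.getDefaultInstance()

    override suspend fun readFrom(input: InputStream): AddressBook =
        try {
            AddressBook.parseFrom(input)
        } catch (e: InvalidProtocolBufferException) {
            // Signals unreadable data to DataStore.
            throw CorruptionException("Cannot read proto.", e)
        }

    override suspend fun writeTo(t: AddressBook, output: OutputStream) =
        t.writeTo(output)
}
```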

Tip: Always keep your .proto files under version control and in sync with any other systems that use them (like your server). The .proto is essentially a contract for your data. Changes should be made carefully (e.g., avoid reusing field numbers or removing fields abruptly).

Proto3 also has support for well-known types (like Timestamp, Duration, etc. from google.protobuf package), which you can use to avoid reinventing common structures. And if you need to interoperate with JSON, you can use the JSON format utilities provided by the library.

By using Protocol Buffers in your Android Kotlin project, you gain the benefits of a strongly typed schema, efficient binary serialization, and easy integration across different components of your app and even different programming languages. It might feel like an upfront investment to define schemas, but for many applications the reliability and performance pay off. Happy coding with proto!

(Overview | Protocol Buffers Documentation) (Working with Proto DataStore | Android Developers)


Last updated 8 months ago