Sunday, 20 July 2014

Java - Serialization, Deserialization and Externalization

Serialization is a core feature of Java. It means converting an object's state into a binary format so that it can be persisted to a file system or DB.
The primary purpose of Java serialization is to write an object into a stream, so that it can be stored or transported over a network and rebuilt later. Because two different parties are involved, a protocol is needed to rebuild exactly the same object.
If you want objects of a class to be serializable, all you need to do is implement the java.io.Serializable interface. Serializable is a marker interface (tag interface) and has no fields or methods to implement.
What does a marker interface mean, and what can it do if it has no methods and no fields?
Serialization in Java is opt-in. For safety, an object cannot be persisted or transported unless its class has explicitly declared that this is allowed. The marker interface is how a class makes that declaration: it adds no behaviour, but the serialization runtime checks for it and only then lets the object through. Other marker interfaces include java.rmi.Remote and java.lang.Cloneable.
If a class does not implement Serializable, trying to serialize one of its objects throws java.io.NotSerializableException at runtime.
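To make the marker check concrete, here is a minimal sketch that tries to write two objects into an in-memory stream. The class names PlainPoint, SerializablePoint and the helper canSerialize are made up for this illustration; only the java.io classes are real.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class MarkerInterfaceDemo {

    // Hypothetical class that does NOT implement Serializable.
    static class PlainPoint {
        int x = 1;
        int y = 2;
    }

    // Same fields, but tagged with the marker interface.
    static class SerializablePoint implements Serializable {
        private static final long serialVersionUID = 1L;
        int x = 1;
        int y = 2;
    }

    // Returns true if the object could be written, false if
    // ObjectOutputStream rejected it with NotSerializableException.
    static boolean canSerialize(Object obj) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("PlainPoint serializable? " + canSerialize(new PlainPoint()));
        System.out.println("SerializablePoint serializable? " + canSerialize(new SerializablePoint()));
    }
}
```

The two classes have identical fields, so the only thing that changes the outcome is the presence of the marker interface.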
The serialization process is implemented by ObjectOutputStream (writing) and ObjectInputStream (reading), so all we need is to wrap them around a file stream or a network stream. Let's see a simple serialization example.
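import java.io.Serializable;

public class Employee implements Serializable {
    private static final long serialVersionUID = -6470090944414208496L;

    private String name;
    private int id;
    private transient int salary; // transient: left out of the serialized form

    public Employee(int id, String name, int salary) {
        this.id = id;
        this.name = name;
        this.salary = salary;
    }

    // getter methods (setters omitted for brevity)
    public String getName() { return name; }
    public int getId() { return id; }
    public int getSalary() { return salary; }

    @Override
    public String toString() {
        return "Employee{name=" + name + ",id=" + id + ",salary=" + salary + "}";
    }
}
The Employee class above has name, id and salary attributes with their getter methods. Let's create a SerializeTest class that serializes an Employee object.

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

class SerializeTest {
    public static void main(String[] args) {
        Employee emp = new Employee(1, "Rajesh", 10000);
        // try-with-resources closes both streams even if writeObject fails
        try (FileOutputStream fileOut = new FileOutputStream("/tmp/employee.ser");
             ObjectOutputStream out = new ObjectOutputStream(fileOut)) {
            out.writeObject(emp);
            System.out.println("Serialized data is saved in /tmp/employee.ser");
        } catch (IOException i) {
            i.printStackTrace();
        }
    }
}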
As you have seen above, we serialized the Employee object to the file employee.ser.
We can now read the same object back from employee.ser. Rebuilding an object from a file or DB is called deserialization.
Let's see how to deserialize the same object with the example below:

import java.io.FileInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;

public class DeserializeExample {
    public static void main(String[] args) {
        Employee e;
        try (FileInputStream fileIn = new FileInputStream("/tmp/employee.ser");
             ObjectInputStream in = new ObjectInputStream(fileIn)) {
            e = (Employee) in.readObject();
        } catch (IOException i) {
            i.printStackTrace();
            return;
        } catch (ClassNotFoundException c) {
            System.out.println("Employee class not found");
            return; // nothing to print if the class could not be loaded
        }
        System.out.println("Deserialized Employee...");
        // name and salary are private, so read them through the getters
        System.out.println("Name: " + e.getName());
        System.out.println("Salary: " + e.getSalary());
    }
}

Output -
Deserialized Employee...
Name: Rajesh
Salary: 0
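
To reproduce this kind of output without touching the file system, here is a small sketch that round-trips an object through a byte array. The Account class and the helpers toBytes and fromBytes are hypothetical names chosen for this example. It has one normal field, one transient field and one static field, so all three cases can be compared side by side.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientDemo {

    // Hypothetical class used only for this illustration.
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;

        static String company = "Original"; // belongs to the class, never written
        String owner;                        // written to the stream
        transient int pin;                   // skipped by serialization

        Account(String owner, int pin) {
            this.owner = owner;
            this.pin = pin;
        }
    }

    // Serializes the account into an in-memory byte array.
    static byte[] toBytes(Account account) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(account);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    // Rebuilds an account from bytes produced by toBytes.
    static Account fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Account) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = toBytes(new Account("Rajesh", 1234));
        Account.company = "Changed after writing"; // modify the static field after serializing
        Account copy = fromBytes(data);

        System.out.println("owner   = " + copy.owner);      // Rajesh
        System.out.println("pin     = " + copy.pin);        // 0, the int default
        System.out.println("company = " + Account.company); // Changed after writing
    }
}
```

The static field keeps whatever value the class currently holds in the running JVM, which shows that the stream never carried it.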

You can see that the Employee class has a salary attribute, yet after deserialization its value is 0. Why is this? Because the field is declared transient in the Employee class. So, if we want to keep part of the state out of the serialized form, we use the transient keyword. Similarly, static variable values are not serialized, since they belong to the class and not to the object.

A serializable class should declare a serialVersionUID. If the class doesn't define one, it is calculated automatically at runtime. Java uses details of the class such as its name, fields and methods to generate this long number. If you are working with an IDE, you will get a warning such as “The serializable class Employee does not declare a static final serialVersionUID field of type long”. We can use the JDK utility “serialver” to generate the serialVersionUID for a class. The value needs to be there so that the deserialization process knows the new class is a new version of the same class and that the object should be deserialized if possible.

So, serialVersionUID is a version ID that is written to the stream together with the object. When it is not declared, it is a hash computed from the structure of the class, not the hashcode of the object. It is used for version control of the class, and you can specify it in your class yourself. The consequence of not specifying it is that when you add or modify any field in the class, already serialized objects can no longer be recovered, because the serialVersionUID generated for the new class and the one stored with the old serialized object will be different. The Java serialization process relies on a matching serialVersionUID for recovering the state of a serialized object and throws java.io.InvalidClassException in case of a mismatch.
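
The ID that serialization will use for a class can also be inspected from code through java.io.ObjectStreamClass. The sketch below is only an illustration; Versioned, Unversioned and uidOf are names invented here. It prints the declared ID for one class and the automatically computed ID for another.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class SerialVersionDemo {

    // Declares its own ID, so the value stays fixed when fields change.
    static class Versioned implements Serializable {
        private static final long serialVersionUID = 42L;
        String name;
    }

    // No declared ID: the runtime computes one from the class structure.
    static class Unversioned implements Serializable {
        String name;
    }

    // Looks up the serialVersionUID that serialization would use for a class.
    static long uidOf(Class<?> type) {
        ObjectStreamClass descriptor = ObjectStreamClass.lookup(type);
        if (descriptor == null) {
            // lookup returns null for classes that are not serializable
            throw new IllegalArgumentException(type.getName() + " is not serializable");
        }
        return descriptor.getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("Versioned:   " + uidOf(Versioned.class));   // 42
        System.out.println("Unversioned: " + uidOf(Unversioned.class)); // computed value
    }
}
```

Adding a field to Unversioned and running the program again would normally print a different computed value, which is why old serialized data stops loading after such a change, while Versioned keeps reporting 42.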


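The default serialization process is done automatically. Sometimes we want to control exactly what is written, for example to obscure the object data or to check its integrity when reading it back. We can do this by implementing the java.io.Externalizable interface and providing implementations of the writeExternal() and readExternal() methods, which are then used in the serialization process. An Externalizable class must also have a public no-argument constructor, because deserialization creates the object with it before calling readExternal().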
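import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;

public class Employee implements Externalizable {
    private int id;
    private String name;

    // Externalizable requires a public no-arg constructor for deserialization
    public Employee() {
    }

    public Employee(int id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(id);
        out.writeObject(name + "xyz"); // append a marker suffix to the stored name
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException,
            ClassNotFoundException {
        // fields must be read in the same order they were written
        id = in.readInt();
        name = (String) in.readObject();
        // reject data whose marker suffix is missing
        if (!name.endsWith("xyz")) throw new IOException("corrupted data");
        name = name.substring(0, name.length() - 3); // strip the marker again
    }

    @Override
    public String toString() {
        return "Employee{id=" + id + ",name=" + name + "}";
    }
}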

We can use the code below to serialize and deserialize the object:

        // fileName is the path of the file to use;
        // employee is the Employee instance to write
        try (FileOutputStream fos = new FileOutputStream(fileName);
             ObjectOutputStream oos = new ObjectOutputStream(fos)) {
            oos.writeObject(employee);
        } catch (IOException e) {
            e.printStackTrace();
        }

        // read the object back from the same file
        try (FileInputStream fis = new FileInputStream(fileName);
             ObjectInputStream ois = new ObjectInputStream(fis)) {
            Employee emp = (Employee) ois.readObject();
            System.out.println("Employee Object Read=" + emp);
        } catch (IOException | ClassNotFoundException e) {
            e.printStackTrace();
        }
So, you have seen that the Externalizable interface gives us more control. We can even leave some fields out of the serialized form without marking them transient.
Sometimes we need to extend a class that doesn't implement the Serializable interface. If we rely on the automatic serialization behavior and the superclass has some state, that state is not written to the stream and hence cannot be retrieved later on. On deserialization, the superclass fields are initialized by its no-argument constructor instead, which must be accessible to the subclass.
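
Here is a minimal sketch of that situation, using made-up Person and Manager classes. Person does not implement Serializable, Manager extends it and does. After a round trip through a byte array, the subclass field is restored, but the superclass field holds whatever the Person no-argument constructor sets.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SuperclassStateDemo {

    // Hypothetical superclass that is NOT serializable.
    static class Person {
        String name;

        Person() {               // runs again during deserialization
            this.name = "unknown";
        }

        Person(String name) {
            this.name = name;
        }
    }

    // Serializable subclass: only its own fields go into the stream.
    static class Manager extends Person implements Serializable {
        private static final long serialVersionUID = 1L;
        int teamSize;

        Manager(String name, int teamSize) {
            super(name);
            this.teamSize = teamSize;
        }
    }

    // Writes the manager to a byte array and reads it back.
    static Manager roundTrip(Manager manager) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(manager);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Manager) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Manager copy = roundTrip(new Manager("Rajesh", 5));
        System.out.println("teamSize = " + copy.teamSize); // 5
        System.out.println("name     = " + copy.name);     // unknown
    }
}
```

If the superclass state matters, the subclass has to save and restore it itself, for example through the Externalizable approach shown earlier.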
