Magento DataFlow - Data Exchange Made Flexible [Part 1]

One of the major features of e-commerce websites is the ability to share data with offline sales management systems. Magento makes data exchange flexible and fairly easy with its DataFlow module.

Magento DataFlow is a data exchange framework that uses four types of components: adapters, parsers, mappers and validators. At the current state of development validators are not implemented, but are reserved for future use.

The flow of a data exchange process is defined as an XML structure called a profile. Magento provides a simple wizard-like tool that generates basic import/export profiles operating on product or customer entities. An advanced profiles manager is also provided for users who can write the profile XML by hand and who need more custom dataflow operations, including ones that involve other entities.

Adapters are responsible for plugging into an external data resource and fetching the requested, filtered data. They can be used, for example, to get data from a local or remote file, a web service, a database and more.

For example, to load data from a CSV file you can put the following code in the XML profile:

<action type="dataflow/convert_adapter_io" method="load">
    <var name="type">file</var>
    <var name="path">var/import</var>
    <var name="filename"><![CDATA[products.csv]]></var>
    <var name="format"><![CDATA[csv]]></var>
</action>

To load data from a remote FTP server you can use the same adapter, but with these parameters:

<action type="dataflow/convert_adapter_io" method="load">
    <var name="type">ftp</var>
    <var name="host"><![CDATA[]]></var>
    <var name="passive">true</var>
    <var name="user"><![CDATA[user]]></var>
    <var name="password"><![CDATA[password]]></var>
    <var name="path">var/import</var>
    <var name="filename"><![CDATA[products.csv]]></var>
    <var name="format"><![CDATA[csv]]></var>
</action>
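
The same adapter can also write data out, which is what an export profile needs as its last step. A minimal sketch, assuming the adapter's save method; the var/export directory and the file name are placeholders rather than required values:

<action type="dataflow/convert_adapter_io" method="save">
    <!-- assumed values: adjust the directory and file name to your setup -->
    <var name="type">file</var>
    <var name="path">var/export</var>
    <var name="filename"><![CDATA[products_export.csv]]></var>
</action>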

Parsers are responsible for transforming one data format into another. They can be used, for example, to convert CSV file content into a two-dimensional array, or the opposite.

To parse CSV file content into database product entities, you can use this code in your profile:

<action type="dataflow/convert_parser_csv" method="parse">
    <var name="delimiter"><![CDATA[,]]></var>
    <var name="enclose"><![CDATA["]]></var>
    <var name="fieldnames">true</var>
    <var name="store"><![CDATA[0]]></var>
    <var name="number_of_records">1</var>
    <var name="decimal_separator"><![CDATA[.]]></var>
    <var name="adapter">catalog/convert_adapter_product</var>
    <var name="method">parse</var>
</action>

The adapter named in the parser's <var name="adapter"> variable, together with the adapter method named in <var name="method">, is responsible for processing the loaded data. In this particular case the parser converts the CSV file content into a two-dimensional array and calls the adapter's "parse" method to process it.
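
The opposite direction mentioned earlier, turning an array back into CSV text, goes through the same parser. The fragment below is only a sketch: it assumes the parser's unparse method and that delimiter, enclose and fieldnames behave as they do for parse.

<action type="dataflow/convert_parser_csv" method="unparse">
    <!-- assumption: same formatting variables as in the parse example -->
    <var name="delimiter"><![CDATA[,]]></var>
    <var name="enclose"><![CDATA["]]></var>
    <var name="fieldnames">true</var>
</action>

In an export profile an action like this would typically sit just before the adapter that writes the result to a file.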

The simplest way to customize an import is to create your own adapter and pass it as a variable in the parser definition. In most cases you will extend one of the existing adapters and modify or write your own processing method (usually by overriding saveRow()).
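
On the profile side, plugging in such an adapter could look roughly like the fragment below. This is a sketch under assumptions: the alias mymodule/convert_adapter_vehicle is invented for illustration, and the adapter is assumed to expose a saveRow method that receives each parsed row.

<action type="dataflow/convert_parser_csv" method="parse">
    <var name="delimiter"><![CDATA[,]]></var>
    <var name="enclose"><![CDATA["]]></var>
    <var name="fieldnames">true</var>
    <!-- hypothetical adapter alias; it must resolve to a model class in your own module -->
    <var name="adapter">mymodule/convert_adapter_vehicle</var>
    <!-- assumed method on that adapter that handles each parsed row -->
    <var name="method">saveRow</var>
</action>

The class behind the alias is where rows would be turned into whatever objects your module uses; the rest of the profile stays the same as in the product example.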

Mappers are responsible for altering data as it passes through, most commonly by mapping one field to another.

In the example below, the source's 'reference' column is mapped to the 'sku' column, and the variable '_only_specified' is set to true, so only the listed columns are imported or exported:

<action type="dataflow/convert_mapper_column" method="map">
    <var name="map">
        <map name="sku"><![CDATA[reference]]></map>
        <map name="name"><![CDATA[name]]></map>
        <map name="price"><![CDATA[price]]></map>
        <map name="qty"><![CDATA[qty]]></map>
    </var>
    <var name="_only_specified">true</var>
</action>

This is just the tip of the iceberg of what the Magento DataFlow module offers. Come back later to read more.

5 comments on "Magento DataFlow - Data Exchange Made Flexible [Part 1]"

Vinai (visitor) - Wed, 16/09/2009 - 06:43:

Information on DataFlow currently available is very limited, so thanks for this great post!
Looking forward to the more advanced DataFlow possibilities in the coming posts!

Hunter (visitor) - Wed, 16/09/2009 - 15:20:

Killer article on breaking this down for everyone! I was just curious as to what the following is/does/and if there are any other parameters that can be used for it. Or am I just not seeing the obvious? :)


Thanks again :)

Mike D (visitor) - Mon, 26/10/2009 - 18:06:

I like the article. This is probably a dumb question that I could answer if I spent a little time digging into it, but here goes. Can DataFlow be used to import/update items made for a custom module (I assume it can)? If so, are there any good examples? I have created a vehicle filter module (for an auto parts store) that allows customers to limit the products they see by selecting their vehicle (make, model, options, and year). I would like to use DataFlow to update the fields for these tables. Is this a feasible idea? Any suggestions are greatly appreciated.

Lucjan Wilczewski - Tue, 27/10/2009 - 11:05:

@Mike D
Of course you can use DataFlow for custom modules. Basically, in the examples above dataflow/convert_adapter_io will read any file and dataflow/convert_parser_csv will parse any data from a CSV file. The only part that says the rows of data will be used to populate products is the pair of adapter and method variables defined within dataflow/convert_parser_csv. You can create your own adapter with, for example, a saveRow method, and within this method create new objects of whatever class you wish.

Lucjan Wilczewski - Tue, 01/12/2009 - 12:25:

A more detailed answer on the subject of DataFlow for custom modules: