Magento Dataflow - standard parsers and mapping values [part 4]

As promised in Magento Dataflow - Default Adapters [Part 2], today I will write about the standard parsers in the Magento DataFlow module and about mapping values with mappers.

  1. Parser definition

    Parsers are responsible for transforming data from one format to another. The parser interface Mage_Dataflow_Model_Convert_Parser_Interface defines two methods required in each parser: parse() and unparse(). The definition of a parser within a profile's XML can be as simple as:

    <action type="dataflow/convert_parser_serialize" method="parse" />

    As with adapters, we define an action tag with two attributes: type, which tells which class we want to use, and method, which names the method of that class we want to call. We can also pass variables to the parser within the action tag body, as you will see below.

  2. Standard parsers

    Magento DataFlow includes a few standard parsers, which you can find in app/code/core/Mage/Dataflow/Model/Convert/Parser.

    The simplest of the standard parsers is dataflow/convert_parser_serialize (Mage_Dataflow_Model_Convert_Parser_Serialize), which doesn't require any variables. It does require, though, that one of the previous actions has set data within the profile's container. The parse() method unserializes the data stored within the profile's container and replaces it with the result. The unparse() method does the opposite: it serializes the data stored within the profile's container and replaces it with the result.
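The round trip described above can be sketched in plain PHP. This is only an illustration: the $container variable is a hypothetical stand-in for the profile's container, and the real parser class is not used here.

```php
<?php
// Hypothetical stand-in for the profile's container: a plain variable holding the data.
$container = array(
    array('sku' => 'ABC-1', 'price' => '9.99'),
    array('sku' => 'ABC-2', 'price' => '19.50'),
);

// What unparse() is described as doing: serialize the data and replace it with the result.
$container = serialize($container);

// What parse() is described as doing: unserialize the data and replace it with the result.
$container = unserialize($container);

echo $container[1]['sku'], "\n";
```

After the round trip the container holds the same array it started with, which is why the serialize parser needs no variables of its own.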

    One of the most often used standard parsers is dataflow/convert_parser_csv, which transforms data from (with the parse() method) or to (with the unparse() method) a CSV file. Example of definition:

    <action type="dataflow/convert_parser_csv" method="parse">
        <var name="delimiter"><![CDATA[,]]></var>
        <var name="enclose"><![CDATA["]]></var>
        <var name="fieldnames">true</var>
        <var name="store"><![CDATA[0]]></var>
        <var name="decimal_separator"><![CDATA[.]]></var>
        <var name="adapter">catalog/convert_adapter_product</var>
        <var name="method">parse</var>
    </action>

    If you want to call the parse method, this parser requires that you call an io adapter prior to its execution (for example dataflow/convert_adapter_io to read a CSV file). If you want to store data in a CSV file you have to do both: call an action that sets data within the profile's container prior to the parser's execution, and call an io adapter after the parser's execution to store the data in a file.

    The following variables allow you to customize CSV file parsing:

    • delimiter - defines the delimiter used in the CSV file; defaults to the comma (,) character
    • enclose - defines what character is used to enclose data values; defaults to empty character
    • escape - defines the escape character for the CSV file; defaults to backslash (\)
    • decimal_separator - defines the decimal separator sign
    • fieldnames - if set to true, it is assumed the first row of the CSV file contains field names; if set to false, the map variable is used
    • map - defines field names for files where the first row doesn't contain field names; to see how to define a map, take a look at the section of this article related to mapping values
    • adapter - tells which adapter's method should be called on each row
    • method - tells which method of the adapter should be called on each row; defaults to saveRow

    All variables defined within parser's action body are passed to the defined adapter, so if you need to pass something to it, you can simply set required variable within parser's action body.
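To make the delimiter, enclose, escape, fieldnames and map variables more concrete, here is a minimal standalone sketch that uses PHP's str_getcsv() instead of the Magento parser class. The function name parseCsvText() is hypothetical, the line splitting is naive (it does not handle line breaks inside enclosed values), and the sketch defaults enclose to a double quote, which is an assumption made only because str_getcsv() needs an enclosure character.

```php
<?php
/**
 * Parse CSV text into rows keyed by field name (illustrative sketch, not the core parser).
 */
function parseCsvText($text, array $vars = array())
{
    $delimiter  = isset($vars['delimiter']) ? $vars['delimiter'] : ',';
    $enclose    = isset($vars['enclose']) ? $vars['enclose'] : '"'; // assumption, see lead-in
    $escape     = isset($vars['escape']) ? $vars['escape'] : '\\';
    $fieldnames = !isset($vars['fieldnames']) || $vars['fieldnames'];
    $map        = isset($vars['map']) ? $vars['map'] : array();

    // With fieldnames=false the names come from the map variable.
    $names = $fieldnames ? null : $map;
    $rows  = array();
    foreach (preg_split('/\r\n|\n|\r/', trim($text)) as $line) {
        $values = str_getcsv($line, $delimiter, $enclose, $escape);
        if ($names === null) {
            $names = $values; // the first row holds the field names
            continue;
        }
        $rows[] = array_combine($names, $values);
    }
    return $rows;
}

$csv  = "sku;name;price\nABC-1;\"Red; large\";9.99\n";
$rows = parseCsvText($csv, array('delimiter' => ';'));
echo $rows[0]['name'], "\n";
```

In the real profile, each resulting row would then be handed to the configured adapter method rather than collected in an array.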

    The last of the standard parsers included in the DataFlow module is dataflow/convert_parser_xml_excel (Mage_Dataflow_Model_Convert_Parser_Xml_Excel), which converts data from and to an Excel XML file. Example of definition:

    <action type="dataflow/convert_parser_xml_excel" method="unparse">
        <var name="single_sheet"><![CDATA[products]]></var>
        <var name="fieldnames">true</var>
    </action>

    The usage requirements are the same as for dataflow/convert_parser_csv.

    The following variables allow you to customize Excel XML file parsing:

    • fieldnames - if set to true, it is assumed the first row of the sheet contains field names; if set to false, the map variable is used
    • map - defines field names for files where the first row doesn't contain field names
    • single_sheet - tells whether one sheet or all sheets should be parsed; should contain the name of the sheet to be parsed
    • adapter - tells which adapter's method should be called on each row
    • method - tells which method of the adapter should be called on each row; defaults to saveRow

  3. Standard customer and product entity parsers

    For the most commonly exchanged entities - customer and product - Magento also provides standard parsers: customer/convert_parser_customer (Mage_Customer_Model_Convert_Parser_Customer) and catalog/convert_parser_product (Mage_Catalog_Model_Convert_Parser_Product). Both inherit from Mage_Eav_Model_Convert_Parser_Abstract.

    Since the standard adapters' load() methods return only an array of entity id values, we have to call the parser's unparse() method if we want to get more related data. Both parsers take this array and, for each entity, parse the content of its data variable, ignore system fields, objects and non-attribute fields, and create an associative array from the rest. Additionally, the product parser adds to the array the result of parsing the product's related stock item object, and the customer parser adds the result of parsing the shipping and billing addresses and information about newsletter subscription.

    Both entity parsers have deprecated parse() methods, since their function is now mostly performed by parser actions that call standard adapter methods within the parser's context. Example of a product parser definition that parses only products from a selected store:

    <action type="catalog/convert_parser_product" method="unparse">
        <var name="store"><![CDATA[1]]></var>
    </action>

  4. Mapping values

    The DataFlow module also provides a mapper concept: a class with a map() method that is responsible for mapping processed fields from one name to another. A mapper definition looks, for example, like this:

    <action type="dataflow/convert_mapper_column" method="map">
        <var name="map">
            <map name="category_ids"><![CDATA[categorie]]></map>
            <map name="sku"><![CDATA[reference]]></map>
            <map name="name"><![CDATA[titre]]></map>
            <map name="description"><![CDATA[description]]></map>
            <map name="price"><![CDATA[prix]]></map>
            <map name="special_price"><![CDATA[special_price]]></map>
            <map name="manufacturer"><![CDATA[marque]]></map>
        </var>
        <var name="_only_specified">true</var>
    </action>

    Again we have an action tag with two attributes: type, set to the mapper class alias, and method, which is called to do the mapping. The dataflow/convert_mapper_column mapper is a standard mapper that you can find in the Magento DataFlow module within the app/code/core/Mage/Dataflow/Model/Convert/Mapper/ folder. Its purpose is to map one array into another, renaming fields, with the possibility of limiting the fields in the result. The name attribute of a map tag tells which source field should be replaced in the new array by a field named after the content of the map tag. If the named field doesn't exist in the source array, the value of the target array's field is set to null. The _only_specified variable tells whether only the fields specified in the map definition should appear in the resulting array.
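The mapping rules described above can be sketched as a small standalone function. This is an illustration of the described behavior under stated assumptions, not the code of the core mapper class: mapColumns() is a hypothetical name, and the handling of fields that are not in the map when _only_specified is off (they are copied unchanged) is my reading of the description.

```php
<?php
/**
 * Rename the fields of one row according to a map (illustrative sketch).
 *
 * @param array $row           source row, field name => value
 * @param array $map           source field name => target field name
 * @param bool  $onlySpecified keep only the mapped fields when true
 * @return array
 */
function mapColumns(array $row, array $map, $onlySpecified = false)
{
    $result = array();
    foreach ($map as $sourceField => $targetField) {
        // A field missing from the source row yields null in the target row.
        $result[$targetField] = array_key_exists($sourceField, $row) ? $row[$sourceField] : null;
    }
    if (!$onlySpecified) {
        // Assumption: unmapped fields are carried over under their original names.
        foreach ($row as $field => $value) {
            if (!array_key_exists($field, $map) && !array_key_exists($field, $result)) {
                $result[$field] = $value;
            }
        }
    }
    return $result;
}

$row = array('sku' => 'ABC-1', 'name' => 'Chair', 'qty' => '4');
$map = array('sku' => 'reference', 'name' => 'titre', 'price' => 'prix');

print_r(mapColumns($row, $map, true));  // reference, titre, prix (prix is null)
print_r(mapColumns($row, $map, false)); // the same three fields plus qty
```

With _only_specified set to true the qty field is dropped, which is how the mapper can be used to limit the columns of an export.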

This article closes the overview of the standard features of the DataFlow module and the basics of its usage.

Magento widgets

Can't wait to see a stable Magento 1.4. One of the new features I am most looking forward to is widgets.

What are these you ask?

Widgets allow you to add informational and marketing content to your site easily from the administrator panel. For example, you will be able to insert links to products, categories or CMS pages, add CMS blocks, and add new, recently compared and recently viewed product lists - all this in a nice and simple way directly in the backoffice. You will not need any programming knowledge to add dynamic content to your store pages.

Widgets can be inserted directly on CMS pages:

        Add widget

All you need to do is choose widget type...

        widget insertion

...and see it on your CMS page.

Or you can put widgets on other (non-CMS) pages, for example the cart, product or category page. To do that, choose CMS > Widgets from the top admin menu and click the "Add New Widget Instance" button.

Add instance button

Choose the one you want and click the continue button. Then you can specify where the widget will be displayed by creating layout updates.


That is it, your widgets are exactly where you need them.

Baobaz at Ad:Tech New-York!

Last week, I attended the Ad:Tech show in New-York, which is one of the biggest events for digital marketers in North America. More than 200 speakers, 250 exhibitors and 10,000 visitors spent 3 days discussing the latest trends in web marketing. It was a great opportunity for Baobaz to present ad'opt - our groundbreaking 'Long Tail PPC' technology - to cutting-edge online marketers.

Conference Ad:Tech New-York 2009


Many topics were addressed during the conferences, including:

  • how brands can benefit from online advertising
  • what the prospects are for traditional media in the digital age
  • the focus on high-quality content to create value for Internet users
  • mobile advertising issues (do you want to understand why Google took over AdMob a few days ago?)
  • how to deal with emerging markets such as China (700,000 mobile subscribers), India and Latin America

Among the exhibitors, some tackled forefront issues of our industry such as email marketing efficiency, augmented reality, Twitter tracking systems, sponsored comments on blogs and forums, legal concerns about advertisers paying bloggers for reviews or DIY video ad.


Some big players like Facebook and Google were also very active during the event, organizing specific workshops about their advertising services. The "Google Ads Factory Tour" was a comprehensive overview of Google's latest measurement and optimization tools, which was all the more interesting for an AdWords Qualified Company such as Baobaz, as the Mountain View firm has recently released amazing new features for AdWords and Analytics.

It was definitely the place to be for digital marketers!

Javits Convention Center Ad:Tech New-York 2009

Varien at Baobaz

Roy Rubin (CEO) and Yoav Kutner (CTO) from Varien came to visit our Parisian office for a studious workshop... Great synergies coming!

Varien at Baobaz: Frédéric Lézy, Roy Rubin, Bertrand Fredenucci, Olivier Ouin, Yoav Kutner, Benjamin Bellamy, Arnaud Ligny
Varien at Baobaz: Roy Rubin, Bertrand Fredenucci, Yoav Kutner
Varien at Baobaz: Roy Rubin, Yoav Kutner
Varien at Baobaz: Bertrand Fredenucci, Roy Rubin
Varien at Baobaz: Yoav Kutner, Olivier Ouin

Magento orders: states and statuses

Magento orders have different states that track their progress (billed, shipped, refunded...) in the order workflow. These states are not visible in the Magento back office. In fact, it is the order statuses that are displayed in the back office, not the states.

Each state can have one or several statuses, and a status can belong to only one state. By default, statuses and states often have the same name, which is why it is a little confusing. Here is the list of the statuses and states available by default.

State code       State name       Status code     Status name
new              New              pending         Pending
pending_payment  Pending Payment  pending_paypal  Pending PayPal
                                                  Pending Amazon Simple Pay
processing       Processing       processing      Processing
complete         Complete         complete        Complete
closed           Closed           closed          Closed
canceled         Canceled         canceled        Canceled
holded           On Hold          holded          On Hold

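To add a new status to a state, you just need to declare it in your module's config.xml file, under the global/sales/order node: the status itself goes under statuses, and the link to a state goes under states/processing/statuses (for the processing state):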

                <!-- Statuses declaration -->
                    <my_processing_status translate="label"><label>My Processing Status</label></my_processing_status>
                <!-- Linking Status to a state -->
                            <my_processing_status />

When we want to modify an order's status in code, we have to be sure that the current order state allows the wanted status. It is possible to change both state and status with the setState method:

$order = Mage::getModel('sales/order')->loadByIncrementId('100000001');
$state = 'processing';
$status = 'my_processing_status';
$comment = 'Changing state to Processing and status to My Processing Status';
$isCustomerNotified = false;
$order->setState($state, $status, $comment, $isCustomerNotified);

$status can also take the value false in order to set only the order state, or the value true to set the status to the first status associated with this state.
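The rule that a status must belong to the target state can be checked before calling setState. The sketch below is standalone and uses a hypothetical hard-coded $stateStatuses array and a hypothetical resolveStatus() helper; in a real store the list of statuses per state would come from the sales order configuration rather than from an array like this.

```php
<?php
// Hypothetical state => statuses map, mirroring part of the default table plus the custom status.
$stateStatuses = array(
    'new'        => array('pending'),
    'processing' => array('processing', 'my_processing_status'),
    'complete'   => array('complete'),
);

/**
 * Work out which status would be applied for a given $state and $status argument.
 * Returns null when only the state should change (status false).
 */
function resolveStatus(array $stateStatuses, $state, $status)
{
    if (!isset($stateStatuses[$state])) {
        throw new InvalidArgumentException("Unknown state: $state");
    }
    if ($status === false) {
        return null; // set the state only, leave the status alone
    }
    if ($status === true) {
        return $stateStatuses[$state][0]; // first status associated with the state
    }
    if (!in_array($status, $stateStatuses[$state], true)) {
        throw new InvalidArgumentException("Status $status is not allowed in state $state");
    }
    return $status;
}

echo resolveStatus($stateStatuses, 'processing', 'my_processing_status'), "\n";
echo resolveStatus($stateStatuses, 'processing', true), "\n";
```

A guard like this fails early with a clear message instead of leaving an order with a status that does not match its state.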

You can now adjust your order workflow in Magento as you wish.

Magento Backoffice (Admin Panel) Options - [Part 2]

Now that you are a bit familiar with Magento "Admin Panel" options (see Magento Backoffice (Admin Panel) Options - [Part 1]), let's do something creative. You remember that the first part said that you can add your own menu entries? It's very useful if you want to control your module (of course the new menu entry doesn't have to manage a new module, it can be used to handle existing functionality as well). So - let's see how to do it.

First of all, you need to create a new module. Alternatively, you can download the source code of a module which I created for the purpose of this article and to which I am going to refer. The module will display the current time, and its format will be set by you. The format should be a valid argument of the PHP date() function.

Adding an entry to the menu takes place in the config.xml file. What we need are the following lines:

        <example translate="title" module="adminhtml">
            <title>Set Time Format</title>
                    <title>Set It!</title>

This code will add an entry labelled Set Time Format with a subentry Set It!, which refers to the module identified by example, controller index, action index. Basically, that's all that is necessary to put our new entry into the menu. Check the result in the backoffice - the new entry should appear between Catalog and Customers. If not - clear the cache.

The rest of the module is just the files defining the form used to enter the time format and the controller responsible for displaying the form and saving the data. As you can see, I decided to store the data along with the Magento config values. It was just a quick, dirty solution, to avoid creating more than what is essential in this short how-to.

You can also add your entry on the System->Configuration page. It needs a slightly different approach, and this topic will be covered in the next part.

Increase the quality of the photos on Magento

I was recently confronted with photo quality problems in Magento.

At first, I looked in vain for an option in the Magento backend to resolve this... I finally had to accept the obvious: it is not possible (at least up to version 1.3).
I then studied the source code to determine how the framework manages and generates the different sizes of product pictures (on list pages, thumbnails, etc.).

I discovered that Magento uses GD2, with a quality setting of 80% by default (not changeable via configuration, back office or XML). A quality of 80/100 is good in most cases. Nevertheless, in e-commerce it is well known that very good picture quality can make the difference.

The idea is to push the JPEG compression quality to 90%. The possible options are:

  1. modify the PHP code found above: no, you should never directly edit the code of the Magento core! (just as one shouldn't cross the streams.)
  2. create a module to administer the value of the compression quality via the back office: interesting and reusable, but too long to implement. I pass! :-)
  3. override the code: easy, fast and clean, and the solution I have chosen

Changing the JPEG compression quality in Magento:

  1. Step 1

    Copy the file "/lib/Varien/Image/Adapter/Gd2.php" to "/app/code/local/Varien/Image/Adapter/Gd2.php", creating the missing directories if necessary.

  2. Step 2

    Open the file Gd2.php (the copy, not the original) at about line 80 and replace:

    call_user_func($this->_getCallback('output'), $this->_imageHandler, $fileName);

    with:

    if ($this->_fileType === IMAGETYPE_JPEG) {
        call_user_func($this->_getCallback('output'), $this->_imageHandler, $fileName, 90);
    } else {
        call_user_func($this->_getCallback('output'), $this->_imageHandler, $fileName);
    }

    In the code above, I opted for 90, but you can set any quality value from 0 to 100.

  3. Step 3

    Finally, don't forget to empty the image cache via System > Cache Management.

That was simple, effective and reusable on any project, as long as you work with images in JPEG format (the most common format in digital photography) and your server supports GD2.
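To see what the extra quality argument changes, the following standalone sketch writes the same generated image twice with PHP's imagejpeg(), once at quality 80 and once at 90, and compares the file sizes. It assumes the GD extension is installed; the noisy test image and the temporary file prefix are made up for the example and have nothing to do with Magento's own image cache.

```php
<?php
if (!function_exists('imagecreatetruecolor')) {
    exit("GD extension is not available\n");
}

// Build a 200x200 true color image filled with pseudo-random pixels,
// so that the compression level has a visible effect on the file size.
$image = imagecreatetruecolor(200, 200);
mt_srand(1);
for ($x = 0; $x < 200; $x++) {
    for ($y = 0; $y < 200; $y++) {
        imagesetpixel($image, $x, $y, mt_rand(0, 0xFFFFFF));
    }
}

// Save the image at both quality levels and record the resulting sizes.
$sizes = array();
foreach (array(80, 90) as $quality) {
    $file = tempnam(sys_get_temp_dir(), 'gd2');
    imagejpeg($image, $file, $quality); // third argument: JPEG quality, 0 to 100
    clearstatcache();
    $sizes[$quality] = filesize($file);
    unlink($file);
}

printf("quality 80: %d bytes, quality 90: %d bytes\n", $sizes[80], $sizes[90]);
```

The higher quality setting produces larger files, so it is worth checking page weight after raising the value.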


Most visited products - Issue with performance and flat catalog

Ever needed to get the most visited products within Magento? Most solutions and modules available use reports/product_collection and its addViewsCount() method for this. It does the job until you need performance or want to enable the flat product catalog.

Flat catalog issue

What is the problem? Using the flat catalog changes the way products are read from the database. Instead of a complex query with multiple joins to get the attributes of a product built on the EAV model, with the flat catalog you need no joins to get attributes. You also query a different table.

While the flat catalog option was kept in mind for the Mage_Catalog catalog/product_collection, it was forgotten in the Mage::getResourceModel('reports/product_collection')->addViewsCount() method. Though you can fix this issue quite easily by rewriting the collection class, I would like to recommend something different.

Performance problem

To get the most visited products for you, Mage_Reports counts the occurrences of the product view event within the report_event table. This table is used to store 6 types of events and grows in size very fast. On a website with 1000 views per hour and a report_event table of 400 000 records, any query using this table was a site performance killer.


The true problem of using reports/product_collection for calculating visit counts per product is that we mix up two things - events and reports. The solution I have chosen is to create another table holding the calculated number of views for each product, populated by a cron-based task that does the calculation from report_event, and to pack all this together into one module.

Turning on the modified product view calculation together with the flat product catalog decreased the database load 10 times in this particular case.
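The calculation step of such a cron task can be sketched in plain PHP. Everything named here is an assumption made for illustration: the rows are a hand-made sample rather than a real query result, the keys event_type_id and object_id and the value 1 for the product view event type are placeholders, and countProductViews() is a hypothetical helper, not part of the module described above.

```php
<?php
/**
 * Count views per product from a list of event rows (illustrative sketch).
 *
 * @param array $events          rows with 'event_type_id' and 'object_id' keys (assumed names)
 * @param int   $viewEventTypeId id of the product view event type (assumed value)
 * @return array product id => number of views, most viewed first
 */
function countProductViews(array $events, $viewEventTypeId)
{
    $views = array();
    foreach ($events as $event) {
        if ($event['event_type_id'] !== $viewEventTypeId) {
            continue; // ignore the other event types stored in the same table
        }
        $productId = $event['object_id'];
        $views[$productId] = isset($views[$productId]) ? $views[$productId] + 1 : 1;
    }
    arsort($views); // sort by view count, keeping the product ids as keys
    return $views;
}

// Hand-made sample standing in for rows read from the events table.
$events = array(
    array('event_type_id' => 1, 'object_id' => 10),
    array('event_type_id' => 1, 'object_id' => 11),
    array('event_type_id' => 1, 'object_id' => 10),
    array('event_type_id' => 3, 'object_id' => 10),
);

$views = countProductViews($events, 1);
print_r($views);
```

The cron task would write the resulting counts into the separate table, so that front end queries never have to touch the large events table.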

Ez Gento: Managing eZ Publish content from Magento admin interface

While implementing eZ Publish CMS with Magento, we had to remember that customers who are used to the Magento admin interface would not like to learn a completely new interface just to manage content pages. This requirement led to the creation of an additional Magento module that communicates with eZ Publish in a different way than on the front end. This is required because eZ Publish uses drafts for content editing, so if we want to keep all the original functionality, we have to use drafts with our Magento site as well.

Next, as we wanted to include the eZ Publish backoffice in the Magento admin interface, we had to prepare a set of modified eZ Publish backoffice templates, so that the general design would be somewhat similar to the one in Magento. In the end we got the following result:


The left block contains the almost original eZ Publish content tree menu. It's an Ajax-driven structure that loads sub nodes on demand. It's possible to browse the whole content structure, separated into two parts for easier maintenance: content and media library.
The main part is also typical of the eZ Publish backoffice: it contains the current node preview box and the list of sub items, with additional function buttons. From here we can create, edit, remove, sort or move selected nodes. It also allows creating and editing content in any defined content language.

In the page editing view, eZ-Gento gives all the functionality existing in eZ Publish, including OnlineEditor (a TinyMCE editor adapted for eZ Publish). This allows inserting any object from the media library into the page content, or directly uploading an image, movie or downloadable document.

There are some more possibilities, as described in previous posts. For example, we can manage different page versions, and we can temporarily set a page as hidden, so nobody except the admin will see it. There is also the Trash, which stores all deleted pages until it is emptied, so it's possible to restore a deleted page.

Stay tuned for more information about eZ-Gento!

Magento Events

When it comes to extending Magento core functionality you have two options - override core classes or use event-driven architecture. The major disadvantage of first is that you can override class only once, so if you want to override it in multiple modules you are soon going to find yourself in dependency hell. Event-driven architecture allows you to keep loose coupling without losing the flexibility of extending Magento modules.

When you want to use Magento event-driven architecture you must know basically two things - how to dispatch an event and how to catch it.

Dispatching events

Within Magento, dispatching an event is as simple as calling the Mage::dispatchEvent(...) method, for example:

 Mage::dispatchEvent('custom_event', array('object'=>$this));

This method accepts two parameters: the unique event identifier and an associative array of data that is set as the Varien_Event_Observer object's data, and so is in fact passed to the event observers.

Catching events

Catching events is a little bit more complex than dispatching. You have to use an existing custom module or create a new one. In the minimal case the module needs two files: a config.xml file and an Observer.php file.


Within config.xml you have to add a definition of event observer. Which of the main config.xml file sections (frontend, adminhtml) should contain this definition depends on the scope you want your observer to work on. Here is the example of definition:

    <events>
      <custom_event> <!-- identifier of the event we want to catch -->
        <observers>
          <custom_event_handler> <!-- identifier of the event handler -->
            <type>model</type> <!-- class method call type; valid are model, object and singleton -->
            <class>baobazacustommodule/observer</class> <!-- observers class alias -->
            <method>customObserverAction</method>  <!-- observer's method to be called -->
            <args></args> <!-- additional arguments passed to observer -->
          </custom_event_handler>
        </observers>
      </custom_event>
    </events>

The XML above should be self-explanatory. I will just explain that the types model and object are equal in behavior and mean that an object of the class will be instantiated using the Mage::getModel(...) method, while singleton means it will be instantiated using the Mage::getSingleton(...) method.

The Observer.php file should contain the relevant observer class. There is no interface to implement and no need to extend any class for observer classes. The method, though, should accept one parameter, which is an object of the Varien_Event_Observer class. This object is the link between the dispatcher and the event handler. It inherits from Varien_Object, so it has all required getters handled magically. For example:

class Baobaz_ACustomModule_Model_Observer
{
  public function customObserverAction(Varien_Event_Observer $observer)
  {
    $object = $observer->getEvent()->getObject(); // we are taking the item with 'object' key from array passed to dispatcher

    return $this;
  }
}

Default events

Magento implements a lot of events. You can find a list of them here. What you may miss when reading this list, and what was spotted on the MageDev blog, is that Mage_Core_Model_Abstract dispatches some special events by default. Those are:

event identifier                event parameters
model_save_before               'object'=>$this
{_eventPrefix}_save_before      {_eventObject}=>$this
model_save_after                'object'=>$this
{_eventPrefix}_save_after       {_eventObject}=>$this
model_delete_before             'object'=>$this
{_eventPrefix}_delete_before    {_eventObject}=>$this
model_delete_after              'object'=>$this
{_eventPrefix}_delete_after     {_eventObject}=>$this
model_load_after                'object'=>$this
{_eventPrefix}_load_after       {_eventObject}=>$this


{_eventPrefix} means the value of the $_eventPrefix variable and {_eventObject} means the value of the $_eventObject variable. All classes inheriting from Mage_Core_Model_Abstract should override these variables so that specific events are dispatched. For example, for a catalog category these variables take the following values: $_eventPrefix = 'catalog_category';  $_eventObject = 'category';
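As a quick way to see which event identifiers a given prefix produces, here is a small standalone sketch that builds the names from the table above. The modelEventNames() function is a hypothetical helper written for this illustration; it does not call Magento and only reproduces the naming pattern.

```php
<?php
/**
 * List the generic and the prefixed event identifiers for a model (illustrative sketch).
 *
 * @param string $eventPrefix value of the model's $_eventPrefix variable
 * @return array
 */
function modelEventNames($eventPrefix)
{
    $suffixes = array('save_before', 'save_after', 'delete_before', 'delete_after', 'load_after');
    $names = array();
    foreach ($suffixes as $suffix) {
        $names[] = 'model_' . $suffix;            // generic event, data key 'object'
        $names[] = $eventPrefix . '_' . $suffix;  // specific event, data key from $_eventObject
    }
    return $names;
}

print_r(modelEventNames('catalog_category'));
```

For the catalog category example this yields identifiers such as catalog_category_save_before, whose observers can read the model with $observer->getEvent()->getCategory().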