Generating and Maintaining Hierarchical Paths using Pathauto in Drupal
November 04, 2014

Introduction

In a previous blog post, we explained what hierarchical paths are, gave examples, and solved a problem with the pathauto module of Drupal 7 that caused certain links to contain a buggy "front" slug.

In this blog post, we will go deeper into all the issues that come with implementing a proper URL paths hierarchy in Drupal 7 and try to propose working solutions where possible.

Why Implement Hierarchical Paths in the First Place?

If you have a website where editors can create pages and sub-pages, you will need to automatically output hierarchical paths if you are to implement a solid website. Let's take this collection of hierarchical, or tree-like, list of links:

http://example.com/about
http://example.com/about/mission
http://example.com/about/vision
http://example.com/about/history
http://example.com/about/history/old
http://example.com/about/history/new

A quick look at each link tells the user where the page is located in the structure of your website. For example, the second link directly tells us we're looking at the "mission" page which falls under the "about" page.

On top of that, when your website's links are properly organized into a hierarchy, you can use the Easy Breadcrumbs Drupal Module to, well, easily generate breadcrumbs, almost as simply as plugging the module and enabling it. Check this article on Wikipedia for more info about breadcrumbs, which might be more than necessary for websites with intricate navigational structures.

Requirements for Implementing Hierarchical Paths

First of all, you need the pathauto module. Second, you need a standard Drupal menu where your pages can be listed and organized into parent and child menu items. The menu is the one actually providing the hierarchy. Pathauto merely takes that tree-like hierarchy and tries, not very well as we'll see later, to generate a proper hierarchical path from it.

For pathauto to do its magic, you'll need to define a "pattern" for your page's paths. Let us spare you the research and assure you this is one of the best patterns we could find, and believe us, we've looked a lot. Of course, modify it according to your needs.

[node:menu-link:parent:url:path]/[node:title]

Basically, this pattern has the following special properties:

  1. If a node is not part of the menu tree, pathauto will then simply generate a path based on the title of the node. Super useful. This is something most other patterns fail at and your node ends up with no custom path at all if it's not in a menu.
  2. If the node is in the menu, and is on the top most level, this also defaults to its title. Exactly what we want.
  3. Finally, if the node is in a menu, and it has a parent, this pattern will correctly prepend the path of the parent menu item to the beginning of the path of the current item. Also, this will work recursively so menu items on deeper levels in the hierarchy will also get a proper path.

Unfortunately, this comes with certain limitations, which we believe would have been better addresses in the pathauto module itself. However, if you need to have this functionality properly implemented now, you need to write custom code as we detail below to handle edge cases.

In the future, we plan on creating a Drupal contrib module to address this issue as a whole or maybe even try, if possible, to push this functionality into the pathauto module.

Limitations and Solutions Where Possible

The first limitation is that a node with sub-menu items can be deleted by website editors, leaving all nodes below it in the menu with incorrect paths. Our solution here is simply to not allow editors to delete such nodes. We have previously detailed this approach in our conditionally preventing node deletion in Drupal tutorial.

The second limitation or bug is that menu items that are under the front page menu item will get a "front" slug added to the beginning or their URLs. This is definitely not desirable and we detail a fix for this issue in our blog post about removing the front slug from hierarchical paths generated by pathauto.

The third limitation is that when the path of the parent menu item changes, its child menu items don't get updated and hence their paths become incorrect. In our case, simply changing the title of the parent node will cause a change in its path and cause this limitation. We will present custom code that fixes this issue in the last section of this post.

Finally, there are more limitations that we couldn't address due to them being very specific cases or mostly not very essential if your website's permissions are properly configured. We might update this post in the future and tackle them, but for now, here's a listing of those extra issues.

  • Paths generated by the views module aren't affected by our code. Luckily, only technical administrators add, edit, and delete views on our websites and they know how to handle path properly and manually.
  • Rearranging the menu items from the Manage Menu admin interface is not handled by our code. We probably need to find the hooks used for the Manage Menu interface and implement our approach on those hooks. For now, we don't need this as our website editors don't have access to this admin interface.
  • Similar to the previous point, if editors have access to the Mange Menu interface, they can create "recursive" or "infinite-looping" menus by creating menu items, without an actual page, that point to a parent page while including this menu item as a child of this parent. Very specific, but it might happen. Luckily, similar to the previous point, we don't need a solution for this case now.

Automatically Updating Child Menu Items Paths on Parent Path Change

In this section, we will solve the third limitation as presented above. Before you start, make sure you know how to create a custom module in Drupal 7.

In addition to the custom code you got from our previous blog posts tackling the first two limitations, this one and this one basically, use the below code to complete automatically update child menu items paths when their parent path changes; which properly solves the third limitation previously presented.

As usual, read the comments in the code below for more details.

Happy coding.

 


<?php

/**
 * This global variable holds node ids so that they are updated after the node
 * and its path are saved. This cannot be done directly inside hook_path_update
 * as when you call pathauto to update child entries their parent will not be
 * saved, at that point in time, to the database and hence they will not get
 * the updated URL slug of that parent.
 * 
 * We hold the variables inside this global var until hook_exit is called,
 * which, luckily, is called after the node is saved.
 * 
 * @global array $GLOBALS['sk_node_ids_to_update']
 * @name $sk_node_ids_to_update 
 */
$GLOBALS['sk_node_ids_to_update'] = array();



/**
 * Implements hook_path_update so that we update child menu items paths
 * whenever a parent item on the main menu has its path updated.
 * 
 * Check notes on $GLOBALS['sk_node_ids_to_update'].
 * 
 * @param array $upi
 */
function sk_path_update($upi) {   
    //Init path info
    //$nodep example: node/38
    //$oldp example: about-title-17
    //$newp example: about-title-18
    $nodep = (!empty($upi['source'])) ? $upi['source'] : '';
    $oldp = (!empty($upi['original']['alias'])) ? $upi['original']['alias'] : '';
    $newp = (!empty($upi['alias'])) ? $upi['alias'] : '';
    
    //If old and new are the same, then nothing changed, stop here
    if($oldp == $newp) {
        return;
    }
    
    //If node path is empty, then something is seriously wrong and we can't get 
    //a node id to use it to start getting the menu paths. Stop then to be
    //on the safe side. Basically, this shouldn't happen, but be safe.
    if(empty($nodep)) {
        return;
    }
    
    //Get all child menu items node ids using our special function
    $nids_to_update = sk_get_all_menu_node_children_ids($nodep);
    
    //Add node ids to be updated later on when our module exists while making
    //sure that we add unique values only as hook_path_update will be called
    //multiple times and it will process the same child items more than once.
    if(!empty($nids_to_update)) {
        foreach($nids_to_update as $nid_to_update) {
            if(!in_array($nid_to_update, $GLOBALS['sk_node_ids_to_update'])) {
                $GLOBALS['sk_node_ids_to_update'][] = $nid_to_update;
            }
        }
    }
}



/**
 * Implements hook_exit so that we update child menu items paths after
 * the parent item is committed (saved) to the database.
 * 
 * Check notes on $GLOBALS['sk_node_ids_to_update'].
 * 
 * Check the documentation of hook_exit as it seems you're not allowed to
 * output messages even to drupal messages in here. Only rely on watchdog.
 * 
 * @param string $destination
 */
function sk_exit($destination = null) {
    //If we don't have an array ready for us with some nids inside it, then
    //we have nothing to do and just use 'return' to exit from the hook.
    if(empty($GLOBALS['sk_node_ids_to_update'])) {
        return;
    }
    if(!is_array($GLOBALS['sk_node_ids_to_update'])) {
        return;
    }
    if(count($GLOBALS['sk_node_ids_to_update']) == 0) {
        return;
    }
    
    //Init by copying the global array so that we empty it next
    $nids_to_update = $GLOBALS['sk_node_ids_to_update'];

    //Empty the global array so that subsequent calls to exit don't update again
    $GLOBALS['sk_node_ids_to_update'] = array();

    //Paths update using the multi update method of the pathauto module
    pathauto_node_update_alias_multiple($nids_to_update, 'update');

    //Log to watch dog as info
    $updated_nodes = node_load_multiple($nids_to_update);
    if(!empty($updated_nodes)) {
        $t = '';
        $t .= '%count child menu items were updated to reflect ';
        $t .= 'the URL path change of the currently updated node.';
        $msg = '';
        $updated_nodes_count = count($updated_nodes);
        $msg .= '<p>' . t($t, array('%count' => $updated_nodes_count)) . '</p>';
        $msg .= '<table>';
        $msg .= '<tr><th>#</th><th>Title</th><th>URL</th></tr>';
        $c = 1;
        foreach($updated_nodes as $updated_node) {
            if(!empty($updated_node->nid)) {
                $url = url('node/' . $updated_node->nid);
                $title = $updated_node->title;
                $msg .= ($c % 2 == 0) ? '<tr class="even">' : '<tr class="odd">';
                $msg .= '<td>' . $c++ . '</td>';
                $msg .= '<td><a href="' . $url . '">' . $title . '</a></td>';
                $msg .= '<td><a href="' . $url . '">' . $url . '</a></td>';
                $msg .= '</tr>';
            }
        }
        $msg .= '</table>';
        watchdog('sk', $msg, array(), WATCHDOG_NOTICE);
    }
}


Drupal Development Open Source SEO

Share this post


Written by
Mario Awad

Founder of SOFTKUBE, lead developer, and getting things done addict. Passionate about open source, user interface design, business development, and the tech world.

More about Mario Awad


About
SOFTKUBE

A small team of experts developing simple, usable, and high-quality web solutions. We blog about business, entrepreneurship, web development, and technology.

More about us


Recent Posts

Gaining Access to a Legacy Google Apps Account When Phone Verification Fails

Custom Theme Migration from Drupal 9 to Drupal 10

How to Change the Most Recent Git Commit Message

How to Make Google Chrome Forget a Permanent HTTP 301 Redirect

View all posts


All Posts Categories

Business Cheat Sheets CLI Design Development Downloads Drupal Email Google Apps HID Keyboards Multilingualism Open Source Philosophy PHP Pointing Devices Productivity Quotes Science Security SEO Technology Thoughts Windows Zend Framework