[WCP] Accelerator for archives browsing


remittor
Junior Member
Posts: 49
Joined: 2019-10-02, 07:18 UTC

[WCP] Accelerator for archives browsing

Post by *remittor »

WCP - WCPatcher for speeding up archive browsing

I don’t know why, but TotalCmd still uses algorithms that are suitable only for proof-of-concept versions of applications. Using the simplest algorithms in 2019 looks very strange.
And the strangest thing is that the TC author does not want to improve the algorithms implemented in earlier versions of TotalCmd.

Earlier, I already suggested that the TC author improve the algorithm for displaying the contents of archives.

However, I became curious how TotalCmd would behave with a proper implementation of file storage.
To find out, I wrote a WCP plugin that hooks 3 functions in TotalCmd and patches the pointer to the file collection.

Now let's try to test the WCP-plugin for speed.

Test bench: Intel J1900 @ 2.4GHz, DDR3 16GiB, SSD 860EVO 512GB, Win7 SP1 x64, TotalCmd 9.22a 32-bit
Test file: 11GB TAR archive with Android sources etc., which contains 453973 files and 52930 directories.

First test: file collection patching is disabled (the plugin works in monitoring mode).
Second test: file collection patching is ENABLED.

Code: Select all

  Test 1    |   Test 2    |  items |  comment/directory
----------------------------------------------------------------  
46527.07 ms |  3351.26 ms | 506900 |  file collection building
----------------------------------------------------------------  
 4615.79 ms |     0.12 ms |      1 |  [root dir]
 4952.83 ms |     1.00 ms |     51 |  [AP\kernel\firmware]
 4830.28 ms |     3.45 ms |    130 |  [AP\kernel\kernel]
 5653.89 ms |     5.08 ms |    188 |  [AP\external]
Now calculate the speed-up when viewing the contents of the archive, using the slowest patched listing: 5653.89 / 5.08 ≈ 1113

It turns out that the WCP plugin speeds up archive browsing by a factor of at least 1000!
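
As a rough check of the ratios in the table above, the arithmetic can be sketched as follows. The helper name `speedup` is made up for this illustration and is not part of the plugin.

```cpp
#include <cassert>
#include <cmath>

// Speed-up factor: how many times faster the patched listing is.
// Both arguments are wall-clock times in milliseconds.
inline double speedup(double before_ms, double after_ms)
{
  assert(after_ms > 0.0);   // a zero reading would make the ratio meaningless
  return before_ms / after_ms;
}
```

With the measured values this gives roughly 1113 for [AP\external], 1400 for [AP\kernel\kernel] and 4953 for [AP\kernel\firmware], so 1113 is the smallest of the listed ratios.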


Download: wcpatcher_v0.7.zip (with source code)

Attention!!!
When WCPatcher is active, glitches are observed after adding a file to or deleting a file from the archive!
This issue is relevant only for archives that contain more than 3000 elements!
Supported versions of TotalCmd: 9.22a, 9.50b11, 9.50b12, 9.50b12a, 9.50b13
Latest version of imgcfg.ini: https://github.com/remittor/wcpatcher/blob/dev/out/imgcfg.ini
Notes:
  • The WCP-plugin activates file collection patching for archives that contain more than 3000 elements.
  • To force the patching ON, add the substring "TURN+ON+WCP" to the archive file name.
  • To force the patching OFF, add the substring "TURN+OFF+WCP" to the archive file name.
  • To view logs, use DbgView.
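
The activation rules in the notes above could be sketched like this. `should_patch` is a hypothetical helper, not taken from the plugin source. Two details are assumptions because the notes do not state them: the OFF marker wins when both markers are present, and the markers are matched case-sensitively.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not taken from the plugin source.
// Assumption: OFF takes precedence over ON, and matching is case-sensitive.
inline bool should_patch(const std::wstring & archive_name, std::size_t item_count)
{
  const std::size_t threshold = 3000;   // "more than 3000 elements" from the notes
  if (archive_name.find(L"TURN+OFF+WCP") != std::wstring::npos)
    return false;                       // forced OFF
  if (archive_name.find(L"TURN+ON+WCP") != std::wstring::npos)
    return true;                        // forced ON
  return item_count > threshold;        // default: patch only large archives
}
```

For example, `should_patch(L"600_files__TURN+ON+WCP.zip", 600)` returns true even though the archive is below the threshold.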
Installation:
The WCP plugin is installed like any other content plugin.
Then, to make Total Commander load the plugin at startup, create a dummy color scheme:
1. On the “Color” page, press the “Define colors by file type…” button.
2. In the “Define colors by file type” dialog, place the cursor anywhere in the list and press the “Add…” button.
3. Press the “Define…” button.
4. In the “Define selection” dialog, switch to the “Plugins” tab.
5. In the “Plugin” dropdown list, select “wcpatcher”.
6. The “Property” field can contain any value (or be left empty).
7. In the “OP” dropdown list, select “=” and enter any integer in the field that follows, for example “1”.
8. Press the “Save” button and give the template a name, for example “WCP”.
9. Press OK in each dialog until the options dialog is closed.
10. Restart TotalCmd.
Last edited by remittor on 2020-01-16, 14:23 UTC, edited 16 times in total.
Horst.Epp
Power Member
Posts: 6489
Joined: 2003-02-06, 17:36 UTC
Location: Germany

Re: [WCP] Accelerator for archives browsing

Post by *Horst.Epp »

Impressive results.
I would like to test a version for TC 9.50.
ghisler(Author)
Site Admin
Posts: 48083
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Re: [WCP] Accelerator for archives browsing

Post by *ghisler(Author) »

It's not clear what exactly you do in your plugin. Do you just replace wcsicmp function? Or create some kind of index or hash function?
Author of Total Commander
https://www.ghisler.com
remittor
Junior Member
Posts: 49
Joined: 2019-10-02, 07:18 UTC

Re: [WCP] Accelerator for archives browsing

Post by *remittor »

ghisler(Author) wrote: 2019-12-30, 09:36 UTC It's not clear what exactly you do in your plugin. Do you just replace wcsicmp function?
I have not yet replaced the function. But the idea is great! (I will implement this in the next version).
ghisler(Author) wrote: 2019-12-30, 09:36 UTC Or create some kind of index or hash function?
Nope! I convert your linear list of files into a tree-like collection.
Tree-like structs

Code: Select all

const int max_path_component_len = 255;

const UINT16 EFLAG_DIRECTORY      = 0x0001;
const UINT16 EFLAG_FILE_ADDR      = 0x0002;     /* file elem has addr */
const UINT16 EFLAG_NAME_CASE_SENS = 0x0004;     /* elem name is case sensitive (a la UNIX) */
const UINT16 EFLAG_CONT_CASE_SENS = 0x0008;     /* dir content is case sensitive (a la UNIX) */


#pragma pack(push, 1)

struct TTreeElem;   /* forward declaration */
typedef TTreeElem * PTreeElem;


struct TElemList {
  TTreeElem   * head;
  TTreeElem   * tail;
};
typedef TElemList * PElemList;


struct TTreeNode {
  TElemList     dir;      /* subdir list */
  TElemList     file;     /* file list */

  TElemList * get_list(bool dirlist) { return dirlist ? &dir : &file; }
};
typedef TTreeNode * PTreeNode;


struct TFileAddr {
  UINT64    fileid;
};


struct TTreeElem {
  TTreeElem     * next;         /* NULL for last element */
  TTreeElem     * owner;        /* link to owner dir */
  union {
    TTreeNode     node;         /* only for directories */
    TFileAddr     addr;         /* only for files */
  } content;
  UINT16          flags;
  UINT16          name_pos;     /* name pos in data->name */
  PFileItem       data;
  SIZE_T          name_hash;    /* hash of original name or of lower-case name (see EFLAG_NAME_CASE_SENS) */
  UINT16          name_len;     /* length of name in characters, excluding the terminating zero */
  WCHAR           name[1];      /* renaming is possible, but a longer name requires reallocating the struct */

  bool is_dir()  { return (flags & EFLAG_DIRECTORY) != 0; }
  bool is_file() { return (flags & EFLAG_DIRECTORY) == 0; }
  bool is_name_case_sens() { return (flags & EFLAG_NAME_CASE_SENS) != 0; }
  bool is_content_case_sens() { return (flags & EFLAG_CONT_CASE_SENS) != 0; }
  LPCWSTR get_data_name() { return data ? (data->name ? data->name + name_pos : NULL) : NULL; }
  TElemList * get_elem_list(bool dirlist) { return is_dir() ? content.node.get_list(dirlist) : NULL; }
  void push_subelem(PTreeElem elem);
  int set_data(PFileItem file_item);
  int set_name(LPCWSTR elem_name, size_t elem_name_len);
};

#pragma pack(pop)
Search func for tree-like

Code: Select all

TTreeElem * FileTree::find_subelem(PTreeElem base, LPCWSTR name, size_t name_len, bool is_dir)
{
  if (!base)
    return NULL;

  if (!base->is_dir())
    return NULL;

  TElemList * elist = base->content.node.get_list(is_dir);
  TTreeElem * elem = elist->head;
  if (!elem)
    return NULL;

  bool const cont_case_sens = base->is_content_case_sens();
  UINT32 const hash_org = get_hash(name, name_len, false);   /* hash for original name */
  UINT32 const hash_lwr = get_hash(name, name_len, true);    /* hash for lower case name */
  size_t const name_size = name_len * sizeof(WCHAR);
  do {
    if (elem->name_len == name_len) {
      if (elem->is_name_case_sens()) {
        if (elem->name_hash != hash_org)
          continue;
        if (memcmp(elem->name, name, name_size) == 0)
          return elem;
      } else {
        if (hash_lwr && elem->name_hash && elem->name_hash != hash_lwr)
          continue;
        if (StrCmpNIW(elem->name, name, (int)name_len) == 0)
          return elem;
      }
    }
  } while ((elem = elem->next) != NULL);   /* advance; "continue" above also lands here */

  return NULL;
} 
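
find_subelem above rejects most siblings by comparing name_len and name_hash before it compares the strings themselves. The plugin's get_hash is not shown in this thread, so the following is only a sketch of what such a function could look like. The name `sketch_hash` and the choice of FNV-1a are assumptions, and only the low 16 bits of each character are hashed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cwctype>

// Hypothetical stand-in for get_hash(); the plugin's real hash may differ.
// FNV-1a (32-bit) over the low 16 bits of each character. With lower == true
// every character is lowercased first, so names that differ only in case
// produce the same hash.
inline std::uint32_t sketch_hash(const wchar_t * name, std::size_t len, bool lower)
{
  std::uint32_t h = 2166136261u;                 // FNV offset basis
  for (std::size_t i = 0; i < len; i++) {
    wchar_t ch = name[i];
    if (lower)
      ch = static_cast<wchar_t>(std::towlower(static_cast<std::wint_t>(ch)));
    const std::uint32_t c = static_cast<std::uint32_t>(ch);
    h ^= c & 0xFFu;         h *= 16777619u;      // low byte, then FNV prime
    h ^= (c >> 8) & 0xFFu;  h *= 16777619u;      // high byte, then FNV prime
  }
  return h;
}
```

A lookup would compute the hash of the requested name once and compare it with the stored name_hash of each sibling, falling back to a full string comparison only when the hashes match.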
Last edited by remittor on 2019-12-30, 14:38 UTC, edited 3 times in total.
ghisler(Author)
Site Admin
Posts: 48083
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Re: [WCP] Accelerator for archives browsing

Post by *ghisler(Author) »

I see - this only seems to bring an advantage for archives with many 1000s of folders - it would probably be slower with all files in one folder. For this case of many folders, sorting the archive contents into subfolders from the beginning is certainly a good idea. But it could break so many things (e.g. paths with leading drive letters, branch view) that I don't want to introduce it this late in the beta test. Most users would never encounter such huge archives...
remittor
Junior Member
Posts: 49
Joined: 2019-10-02, 07:18 UTC

Re: [WCP] Accelerator for archives browsing

Post by *remittor »

ghisler(Author) wrote: 2019-12-30, 10:22 UTC I see - this only seems to bring an advantage for archives with many 1000s of folders - it would probably be slower with all files in one folder.
Nope! TREE-like storage is also relevant for archives containing fewer than 1000 files.

Test bench: Intel J1900 @ 2.4GHz, DDR3 16GiB, SSD 860EVO 512GB, Win7 SP1 x64, TotalCmd 9.22a 32-bit

Test file: "600_files.zip"
Test results for files "600_files__TURN+OFF+WCP.zip" and "600_files__TURN+ON+WCP.zip":

Code: Select all

   WCP OFF  |     WCP ON  |  items |  comment/directory
----------------------------------------------------------------  
    2.83 ms |     2.93 ms |    600 |  linear file collection building
            |     0.47 ms |    600 |  TREE-like file collection building
----------------------------------------------------------------  
   16.73 ms |     0.01 ms |    600 |  enumerate root dir
Speed-up: 16.73 / 0.01 = 1673 times! (The 0.01 ms reading is at the limit of the displayed precision, so this ratio is approximate.)
Test for 90,000 files

Test file: "90,000_files.zip"
Test results for files "90,000_files_TURN+OFF+WCP.zip" and "90,000_files_TURN+ON+WCP.zip":

Code: Select all

   WCP OFF  |     WCP ON  |  items |  comment/directory
----------------------------------------------------------------  
  331.34 ms |   326.01 ms |  90000 |  linear file collection building
            |    49.89 ms |  90000 |  TREE-like file collection building
----------------------------------------------------------------  
 4400.50 ms |     3.11 ms |  90000 |  enumerate root dir
Speed-up: 4400.50 / 3.11 ≈ 1415 times
Last edited by remittor on 2019-12-31, 09:29 UTC, edited 3 times in total.
ghisler(Author)
Site Admin
Posts: 48083
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Re: [WCP] Accelerator for archives browsing

Post by *ghisler(Author) »

This makes no sense, or I don't understand your concept - as I understand it, you only use a tree for the directories, how would this accelerate the loading of a flat directory?
remittor
Junior Member
Posts: 49
Joined: 2019-10-02, 07:18 UTC

Re: [WCP] Accelerator for archives browsing

Post by *remittor »

ghisler(Author) wrote: 2019-12-30, 14:29 UTC This makes no sense, or I don't understand your concept - as I understand it, you only use a tree for the directories...
Nope! TREE-like storage is used for all elements (files + dirs).
ghisler(Author) wrote: 2019-12-30, 14:29 UTC ...how would this accelerate the loading of a flat directory?
Loading of the linear list is not accelerated.
What is accelerated is the search for the items to display in the panel.
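
One plausible reading of this, sketched under assumptions: if every panel refresh has to walk the whole linear list and filter by path, the cost grows with the total number of entries, whereas a per-directory index answers with a single lookup. The functions `list_flat` and `build_index` are hypothetical and are neither TC's nor the plugin's actual code. Names are compared exactly, without case folding.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

typedef std::vector<std::wstring> Names;
typedef std::map<std::wstring, Names> DirIndex;

// Flat storage: every listing walks all paths and keeps direct children of dir.
inline Names list_flat(const Names & paths, const std::wstring & dir)
{
  Names out;
  const std::wstring prefix = dir.empty() ? std::wstring() : dir + L"\\";
  for (std::size_t i = 0; i < paths.size(); i++) {
    const std::wstring & p = paths[i];
    if (p.size() <= prefix.size() || p.compare(0, prefix.size(), prefix) != 0)
      continue;                                   // not below dir
    if (p.find(L'\\', prefix.size()) != std::wstring::npos)
      continue;                                   // deeper than one level
    out.push_back(p.substr(prefix.size()));
  }
  return out;
}

// Indexed storage: group names by parent directory once; a later listing is
// then a single map lookup instead of a scan over all paths.
inline DirIndex build_index(const Names & paths)
{
  DirIndex idx;
  for (std::size_t i = 0; i < paths.size(); i++) {
    const std::wstring & p = paths[i];
    const std::size_t pos = p.rfind(L'\\');
    if (pos == std::wstring::npos)
      idx[std::wstring()].push_back(p);           // entry in the archive root
    else
      idx[p.substr(0, pos)].push_back(p.substr(pos + 1));
  }
  return idx;
}
```

Both functions return the same names for a given directory; the difference is that `list_flat` repeats the full scan on every call, while the index is built once.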
ghisler(Author)
Site Admin
Posts: 48083
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Re: [WCP] Accelerator for archives browsing

Post by *ghisler(Author) »

Do you also handle the case where an archive doesn't explicitly contain a directory? Total Commander does add directories when packing subdirs, but there are packers which don't do it. For example, an archive can contain
dir1\file1.txt
dir2\file2.txt
dir3\file3.txt
but not dir1, dir2 and dir3. Still when listing the root of the archive, these 3 directories must be shown.
remittor
Junior Member
Posts: 49
Joined: 2019-10-02, 07:18 UTC

Re: [WCP] Accelerator for archives browsing

Post by *remittor »

ghisler(Author) wrote: 2019-12-30, 15:58 UTC ... but not dir1, dir2 and dir3. Still when listing the root of the archive, these 3 directories must be shown.
This kind of archive layout was taken into account from the start:
FileTree::add_file_item

Code: Select all

int FileTree::add_file_item(PFileItem fi)
{
  int hr = -1;
  TTreeElem * elem = &m_root;

  FIN_IF(!fi, 0);
  FIN_IF(!fi->name, 0);
  FIN_IF(fi->name[0] == 0, 0);

  size_t nlen = 0;
  LPCWSTR name = fi->name;
  for (LPCWSTR fn = fi->name; /*nothing*/; fn++) {
    if (*fn == L'\\') {
      if (nlen) {
        elem = add_dir(elem, name, nlen);
        FIN_IF(!elem, -10);
      }
      name = fn + 1;
      nlen = 0;
      continue;   /* skip backslash */
    }
    if (*fn == 0) {
      if (nlen) {
        if (fi->attr & tfa::DIRECTORY) {
          elem = add_dir(elem, name, nlen);
          FIN_IF(!elem, -21);
          elem->set_data(fi);
        } else {
          elem = add_file(elem, name, nlen, fi);
          FIN_IF(!elem, -25);
        }
      }
      break;
    }
    nlen++;
  }
  hr = 0;  

fin:
  LOGe_IF(hr, "%s: ERROR = %d", __func__, hr);
  return hr;
}
FileTree::add_dir

Code: Select all

TTreeElem * FileTree::add_dir(PTreeElem owner, LPCWSTR name, size_t name_len)
{
  if (!name_len || name_len > max_path_component_len)
    return NULL;

  TTreeElem * elem = find_subdir(owner, name, name_len);
  if (elem)
    return elem;   /* subdir already exist */

  elem = create_elem(owner, name, name_len, EFLAG_DIRECTORY);
  return elem;
}
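
The effect of add_file_item and add_dir on an archive without explicit directory entries can be reproduced with a much simpler structure. This is a sketch only: `Node` and `add_path` are hypothetical, the plugin uses TTreeElem lists rather than std::map, and case-insensitive matching is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical simplified tree; the plugin uses TTreeElem lists instead.
struct Node {
  bool is_dir = true;
  bool is_explicit = false;   // true if the archive had its own entry for it
  std::map<std::wstring, std::unique_ptr<Node>> children;
};

// Insert one archive path, creating every missing parent directory on the way.
inline void add_path(Node & root, const std::wstring & path, bool is_dir)
{
  Node * cur = &root;
  std::size_t start = 0;
  while (start <= path.size()) {
    const std::size_t end = path.find(L'\\', start);
    const bool last = (end == std::wstring::npos);
    const std::wstring part =
        path.substr(start, last ? std::wstring::npos : end - start);
    if (!part.empty()) {                      // skip empty components ("a\\\\b")
      std::unique_ptr<Node> & slot = cur->children[part];
      if (!slot)
        slot.reset(new Node());               // implied directory or new leaf
      if (last) {                             // final component: the entry itself
        slot->is_dir = is_dir;
        slot->is_explicit = true;
      }
      cur = slot.get();
    }
    if (last)
      break;
    start = end + 1;
  }
}
```

Adding dir1\file1.txt, dir2\file2.txt and dir3\file3.txt to an empty root leaves three directory nodes under the root, each marked as not explicit, which is what a root listing needs to show.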
PFileItem

Code: Select all

namespace tfa {      /* TotalCmd File Attr */
  enum Type : UCHAR {
    READONLY    = 0x01,
    HIDDEN      = 0x02,
    SYSTEM      = 0x04,
    DEVICE      = 0x08,
    DIRECTORY   = 0x10,
    ARCHIVE     = 0x20,
  };
};

#ifdef _WIN64
#pragma pack(push, 8)
#else
#pragma pack(push, 4)
#endif

struct TFileItem {                   /* see TcCreateFileInfo */
/* x32 x64*/ LPVOID     method_get_nameA; /* addr 0x70915C -> 0x70A0EC */
  /*04 08*/  DWORD      sizeLO;
  /*08 0C*/  DWORD      timeLO;   /* MS FILETIME */
  /*0C 10*/  DWORD      timeHI;   /* MS FILETIME */
  /*10 14*/  DWORD      sizeHI;
  /*14 18*/  int        unk6;     /* -1 */
  /*18 1C*/  DWORD      unk7;     /* 0 */
  /*1C 20*/  int        unk8;     /* -1 */   /* elem[28]  elem[7*4] */
  /*20 28*/  LPVOID     ptr1;     /* 0 */
  /*24 30*/  LPVOID     ptr2;     /* 0 */
  /*28 38*/  LPSTR      str1;
  /*2C 40*/  SIZE_T     index;
#ifndef _WIN64
  /*30   */  DWORD      unk12;
#endif
  /*34 48*/  tfa::Type  attr;
  /*35 49*/  BYTE       unkVV;    /* 0 */
  /*36 4A*/  UINT16     auxAttr;
  /*38 4C*/  DWORD      unk14;    /* elem[4144] */
  /*3C 50*/  UINT16     unk15;    /* 0 */
  /*40 58*/  LPVOID     ptr3;     /* 0 */
  /*44 60*/  LPCWSTR    name;     /* wchar_t[1024] */
  /*48 68*/  LPVOID     ptr4;     /* 0 */

  INT64 get_size() { return ((INT64)sizeHI << 32) | sizeLO; }
  INT64 get_time() { return ((INT64)timeHI << 32) | timeLO; }
};

#pragma pack(pop)

typedef  TFileItem * PFileItem;
Last edited by remittor on 2019-12-31, 09:28 UTC, edited 2 times in total.
ghisler(Author)
Site Admin
Posts: 48083
Joined: 2003-02-04, 09:46 UTC
Location: Switzerland

Re: [WCP] Accelerator for archives browsing

Post by *ghisler(Author) »

I see - so the simplest solution for me would be to just have one linear list per subdirectory, where the directory entries point to lists of the contained files/dirs. Then use a hash in these lists to find the correct subdir more quickly.
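
A minimal sketch of that layout, with one entry list per directory and a hash index for subdirectory lookup. All names here (`Dir`, `Archive`, `get_or_add_dir`, `add_file`, `find_dir`) are hypothetical, std::unordered_map stands in for whatever hash would actually be used, and names are matched exactly, without case folding.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical layout: each directory owns a plain list of its entries plus
// a hash index from subdirectory name to its position in the directory table.
struct Dir {
  std::vector<std::wstring> entries;                      // names shown in the panel
  std::unordered_map<std::wstring, std::size_t> subdirs;  // name -> index into dirs
};

struct Archive {
  std::vector<Dir> dirs = std::vector<Dir>(1);            // dirs[0] is the archive root

  // Return the index of subdirectory `name` under `parent`, creating it if needed.
  std::size_t get_or_add_dir(std::size_t parent, const std::wstring & name) {
    auto it = dirs[parent].subdirs.find(name);
    if (it != dirs[parent].subdirs.end())
      return it->second;
    const std::size_t idx = dirs.size();
    dirs.emplace_back();                                  // may reallocate: use indexes only
    dirs[parent].entries.push_back(name);
    dirs[parent].subdirs.emplace(name, idx);
    return idx;
  }

  // Add a file given as "a\b\name"; parent directories are created on demand.
  void add_file(const std::wstring & path) {
    std::size_t cur = 0, start = 0, pos;
    while ((pos = path.find(L'\\', start)) != std::wstring::npos) {
      if (pos > start)
        cur = get_or_add_dir(cur, path.substr(start, pos - start));
      start = pos + 1;
    }
    if (start < path.size())
      dirs[cur].entries.push_back(path.substr(start));
  }

  // Resolve "a\b" to its Dir, or nullptr if it does not exist.
  const Dir * find_dir(const std::wstring & path) const {
    std::size_t cur = 0, start = 0;
    while (start < path.size()) {
      std::size_t pos = path.find(L'\\', start);
      if (pos == std::wstring::npos)
        pos = path.size();
      if (pos > start) {
        auto it = dirs[cur].subdirs.find(path.substr(start, pos - start));
        if (it == dirs[cur].subdirs.end())
          return nullptr;
        cur = it->second;
      }
      start = pos + 1;
    }
    return &dirs[cur];
  }
};
```

Listing a directory then costs one hash lookup per path component plus a walk over that directory's own entries, independent of how many entries the rest of the archive holds.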
remittor
Junior Member
Posts: 49
Joined: 2019-10-02, 07:18 UTC

Re: [WCP] Accelerator for archives browsing

Post by *remittor »

ghisler(Author) wrote: 2019-12-30, 09:36 UTC Do you just replace wcsicmp function?
On your advice, I decided to patch the function tc_wcsicmp:
tc_wcsicmp using shlwapi.dll@StrCmpIW

Code: Select all

//int __usercall tc_wcsicmp@<eax>(WCHAR *wstr1@<eax>, WCHAR *wstr2@<edx>)

size_t __fastcall tc_wcsicmp(const func_hook & fh, tramp_data & td)
{
#ifdef _WIN64
  LPCWSTR const wstr1 = (LPCWSTR)td.get_reg(xreg::cx);
  LPCWSTR const wstr2 = (LPCWSTR)td.get_reg(xreg::dx);
#else
  LPCWSTR const wstr1 = (LPCWSTR)td.get_reg(xreg::ax);
  LPCWSTR const wstr2 = (LPCWSTR)td.get_reg(xreg::dx);
#endif
  int const res = StrCmpIW(wstr1, wstr2);
  if (res < 0)
    return (size_t)(-1);
  if (res > 0)
    return 1;
  return 0;
}
Original function tc_wcsicmp

Code: Select all

int __usercall tc_wcsicmp@<eax>(WCHAR * wstr1@<eax>, WCHAR * wstr2@<edx>)
{
  int result;
  CHAR tmp2[260];
  CHAR tmp1[260];
  WCHAR wtmp2[1024];
  WCHAR wtmp1[1024];

  if ( win_dwPlatformId == 2 )  // Windows NT
  {
    wcsncpyEx(wtmp1, 1023, wstr1);
    wcsncpyEx(wtmp2, 1023, wstr2);
    wstr_lower(wtmp1);
    wstr_lower(wtmp2);
    result = wcscmp(wtmp1, wtmp2);
  }
  else
  {
    WideCharToMultiByte(0, 0, wstr1, -1, tmp1, 259, 0, 0);
    WideCharToMultiByte(0, 0, wstr2, -1, tmp2, 259, 0, 0);
    str_lower(tmp1);
    str_lower(tmp2);
    result = strcmp(tmp1, tmp2);
  }
  return result;
}
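
The decompiled original above copies both strings into 1024-character buffers and lowercases them in full before comparing. For illustration, a comparison that lowercases one character at a time and stops at the first difference could look like the sketch below. `wcsicmp_sketch` is a hypothetical name; the plugin itself calls StrCmpIW. It is an assumption that std::towlower is an acceptable stand-in for TC's wstr_lower, since the two may disagree on non-ASCII text, and unlike the original this version does not truncate at 1023 characters.

```cpp
#include <cassert>
#include <cwctype>

// Hypothetical portable comparison; the plugin itself calls StrCmpIW.
// Lowercases one character at a time, so no temporary buffers are filled,
// and stops at the first difference. Returns -1, 0 or 1.
inline int wcsicmp_sketch(const wchar_t * a, const wchar_t * b)
{
  for (;; a++, b++) {
    const std::wint_t ca = std::towlower(static_cast<std::wint_t>(*a));
    const std::wint_t cb = std::towlower(static_cast<std::wint_t>(*b));
    if (ca != cb)
      return (ca < cb) ? -1 : 1;   // first differing character decides
    if (ca == 0)
      return 0;                    // both strings ended together
  }
}
```

The result is normalized to -1, 0 or 1 in the same way as the hook that wraps StrCmpIW.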
After implementing the patch for tc_wcsicmp in WCP-plugin, I ran performance tests.

Test bench: Intel J1900 @ 2.4GHz, DDR3 16GiB, SSD 860EVO 512GB, Win7 SP1 x64, TotalCmd 9.22a 32-bit
Test file: 11GB TAR archive with Android sources etc., which contains 453973 files and 52930 directories.
Name of file: "android_src__TURN+OFF+WCP.tar" (to disable file collection patching).

Code: Select all

  orig func |    StrCmpIW |  items |  comment/directory
----------------------------------------------------------------  
45626.70 ms | 44141.73 ms | 506900 |  linear file collection building
----------------------------------------------------------------  
 4642.68 ms |  2807.04 ms |      1 |  [root dir] (contain only dir "AP")
 4818.90 ms |  2283.42 ms |    130 |  [AP\kernel\kernel]
 5619.18 ms |  3443.78 ms |    188 |  [AP\external]
Listing time for [AP\external] reduced by: (1 - 3443.78/5619.18)*100% ≈ 38.7%
Same test on 64-bit TotalCmd 9.22a

Code: Select all

  orig func |    StrCmpIW |  items |  comment/directory
----------------------------------------------------------------  
53812.94 ms | 54018.25 ms | 506900 |  linear file collection building
----------------------------------------------------------------  
 7001.65 ms |  5203.97 ms |      1 |  [root dir] (contain only dir "AP")
 6187.24 ms |  4090.68 ms |    188 |  [AP\external]
Listing time for [AP\external] reduced by: (1 - 4090.68/6187.24)*100% ≈ 33.9%
But there are still functions that can be optimized...
remittor
Junior Member
Posts: 49
Joined: 2019-10-02, 07:18 UTC

Re: [WCP] Accelerator for archives browsing

Post by *remittor »

WCP plugin update:
  • Options can now be changed via an INI file.
  • The logging level can now be changed.
  • Patching of the function tc_wcsicmp can now be disabled.
  • The project is published on GitHub.
Download: wcpatcher_v0.5.zip
remittor
Junior Member
Posts: 49
Joined: 2019-10-02, 07:18 UTC

Re: [WCP] Accelerator for archives browsing

Post by *remittor »

WCP plugin update:
  • Added support for 9.50b11.
  • Support for new versions of TotalCmd can now be added through the INI file.
Download: wcpatcher_v0.6.zip
Ovg
Power Member
Posts: 756
Joined: 2014-01-06, 16:26 UTC

Re: [WCP] Accelerator for archives browsing

Post by *Ovg »

There is no "wcpatcher" item in the dropdown list ...
https://yadi.sk/i/v4ZsvAINXubATw