{"id":27624248,"url":"https://github.com/fapulito/jsoncons","last_synced_at":"2026-03-11T14:37:22.405Z","repository":{"id":288721801,"uuid":"968997333","full_name":"fapulito/jsoncons","owner":"fapulito","description":"JSON Console Utility OSS Python Package | COBOL-to-JSON Features | Pretty-Print Dictionaries to JSON","archived":false,"fork":false,"pushed_at":"2025-04-25T18:50:16.000Z","size":42,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-09-25T07:44:02.141Z","etag":null,"topics":["cli","cobol","cobol-to-json","console","dictionary-python","interoperability","json","legacy-code","pretty-print","scripting"],"latest_commit_sha":null,"homepage":"https://pypi.org/project/jsoncons/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fapulito.png","metadata":{"files":{"readme":"README-fib.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY_AUDIT_v1.1.0.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-04-19T06:34:31.000Z","updated_at":"2025-04-21T08:17:57.000Z","dependencies_parsed_at":"2025-04-23T11:42:52.590Z","dependency_job_id":null,"html_url":"https://github.com/fapulito/jsoncons","commit_stats":null,"previous_names":["fapulito/jsoncons"],"tags_count":7,"template":false,"template_full_name":null,"purl":"pkg:github/fapulito/jsoncons","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fapulito%2Fjsoncons","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fapulito%2Fjsoncon
s/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fapulito%2Fjsoncons/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fapulito%2Fjsoncons/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fapulito","download_url":"https://codeload.github.com/fapulito/jsoncons/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fapulito%2Fjsoncons/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30384088,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-11T14:10:17.325Z","status":"ssl_error","status_checked_at":"2026-03-11T14:09:37.934Z","response_time":84,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cli","cobol","cobol-to-json","console","dictionary-python","interoperability","json","legacy-code","pretty-print","scripting"],"created_at":"2025-04-23T11:25:04.888Z","updated_at":"2026-03-11T14:37:22.392Z","avatar_url":"https://github.com/fapulito.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Include Fibonacci Hashing in jsoncons package\n## Extending the `jsoncons` tool and demonstrate the concepts of Fibonacci hashing.\n\n### **1. 
Understanding the Application of Fibonacci Hashing Here**\n\nAs detailed in the provided `fibhash-brief.txt` and the image, Fibonacci hashing (a type of multiplicative hashing) is primarily used to **map a pre-computed hash value (often 64-bit) to a smaller range, typically the index (slot) within a hash table, especially one with a power-of-two size.**\n\nIt excels at:\n*   **Speed:** Faster than integer modulo for arbitrary table sizes. Comparable speed to bitwise AND for power-of-two sizes.\n*   **Distribution:** Effectively mixes the bits of the input hash, spreading consecutive or patterned inputs more evenly across the target range compared to simple modulo or just taking lower bits (bitwise AND).\n\n**Crucially:** Fibonacci hashing *doesn't* inherently speed up the process of *parsing* fixed-width strings (like COBOL data) or *formatting* JSON. Its benefit lies in the hash-to-index mapping step *within* a hash table implementation.\n\nTherefore, we will:\n1.  Create the requested `_fib` functions in `cli.py` as variants of the originals. For the COBOL functions, they will perform the same parsing logic. For `process_json_fib`, it will be an alias. This fulfills the structural requirement.\n2.  Add corresponding commands to the CLI.\n3.  Write tests for these new commands, ensuring they produce the correct *output* (identical to the originals in this case).\n4.  Create a Jupyter Notebook that *demonstrates* the *actual* principle and potential speedup of Fibonacci hashing in a relevant context (mapping hash values to indices), using the data generated by `jsoncons` as input for the demonstration.\n\n### **2. 
Updated `cli.py`**\n\n```python\n# jsoncons/cli.py - v0.3.1 (adding fib variants)\nimport json\nimport sys\nimport argparse\nimport os\nimport logging\nimport decimal\nimport math # Added for potential future use, though not directly in parsing\n\n# (Add this near the top of cli.py or in a separate helper file)\nclass CobolParsingError(ValueError):\n    \"\"\"Custom error for COBOL parsing issues.\"\"\"\n    pass\n\n# --- Fibonacci Hashing Constants (for reference/demonstration) ---\n# From fibhash-brief.txt and image: 2^64 / phi\n# Using the value provided in the text for 64-bit hashing\nFIB_HASH_64_MAGIC = 11400714819323198485\n\ndef fibonacci_hash_to_index(hash_value: int, table_size_power_of_2: int) -\u003e int:\n    \"\"\"\n    Maps a 64-bit hash value to an index for a power-of-2 sized table\n    using Fibonacci hashing.\n\n    Args:\n        hash_value: The input hash value (ideally 64-bit or treated as such).\n        table_size_power_of_2: The size of the hash table (must be a power of 2).\n\n    Returns:\n        An index in the range [0, table_size_power_of_2 - 1].\n\n    Raises:\n        ValueError: If table_size_power_of_2 is not a power of 2.\n    \"\"\"\n    if table_size_power_of_2 \u003c= 0 or (table_size_power_of_2 \u0026 (table_size_power_of_2 - 1)) != 0:\n        raise ValueError(\"table_size_power_of_2 must be a positive power of 2.\")\n\n    # Ensure we are working with 64-bit unsigned semantics for the multiplication\n    hash_value \u0026= 0xFFFFFFFFFFFFFFFF # Mask to 64 bits\n    magic_product = (hash_value * FIB_HASH_64_MAGIC) \u0026 0xFFFFFFFFFFFFFFFF # Multiply and wrap around 64 bits\n\n    # Determine the shift amount\n    # We want log2(table_size) bits from the top\n    shift_amount = 64 - table_size_power_of_2.bit_length() + 1\n\n    # Shift to get the top bits\n    return magic_product \u003e\u003e shift_amount\n\n# --- Original COBOL Parsing Logic ---\ndef parse_cobol_line(line, layout_config, line_num):\n    \"\"\"Parses a single 
line of fixed-width data based on the layout.\"\"\"\n    record = {}\n    expected_len = layout_config.get(\"record_length\")\n\n    # Calculate the length *after* stripping newlines/carriage returns\n    actual_stripped_length = len(line.rstrip('\\n\\r')) # \u003c\u003c\u003c Calculate length here\n\n    # Now check against expected length\n    if expected_len and actual_stripped_length != expected_len:\n        # Use the calculated variable inside the f-string\n        logging.warning(f\"Line {line_num}: Expected length {expected_len}, got {actual_stripped_length}. Processing anyway.\") # \u003c\u003c\u003c Fixed f-string\n        # Decide if you want to raise an error here instead:\n        # raise CobolParsingError(f\"Line {line_num}: Expected length {expected_len}, got {actual_stripped_length}\")\n\n    for field in layout_config.get(\"fields\", []):\n        name = field[\"name\"]\n        # Adjust start_pos to be 0-based index for Python slicing\n        start_index = field[\"start_pos\"] - 1\n        length = field[\"length\"]\n        end_index = start_index + length\n        cobol_type = field.get(\"type\", \"PIC X\").upper()\n        should_strip = field.get(\"strip\", False)\n        implied_decimals = field.get(\"decimals\", 0)\n        is_signed = field.get(\"signed\", False)\n\n        # Slice the data, handle potential short lines gracefully\n        raw_value = line[start_index:end_index] if start_index \u003c len(line) else \"\"\n        # Pad if the slice was shorter than expected (due to short line)\n        raw_value = raw_value.ljust(length)\n\n        processed_value = None\n\n        try:\n            if cobol_type == \"PIC X\":\n                processed_value = raw_value\n                if should_strip:\n                    processed_value = processed_value.strip()\n            elif cobol_type == \"PIC 9\":\n                if not raw_value.strip(): # Handle empty numeric fields\n                    processed_value = None\n                
elif implied_decimals \u003e 0:\n                    # Insert decimal point\n                    # Ensure we have enough digits before slicing\n                    if len(raw_value) \u003e implied_decimals:\n                         decimal_str = raw_value[:-implied_decimals] + \".\" + raw_value[-implied_decimals:]\n                    else: # Handle cases like '50' for PIC 9(0)V99 -\u003e 0.50\n                        decimal_str = \"0.\" + raw_value.zfill(implied_decimals)\n                    processed_value = decimal.Decimal(decimal_str)\n                else:\n                    processed_value = int(raw_value)\n            elif cobol_type == \"PIC S9\":\n                 if not raw_value.strip(): # Handle empty numeric fields\n                    processed_value = None\n                 else:\n                    num_str = raw_value\n                    sign = '+' # Default sign\n                    # Basic sign handling (assumes sign is overpunched or leading/trailing)\n                    # This is a SIMPLIFICATION. 
Real COBOL has many sign conventions.\n                    # Supported here: a separate trailing or leading '+'/'-', and a trailing\n                    # zoned-decimal overpunch, where the last character encodes both the sign\n                    # and the final digit ('{', 'A'-'I' = +0..+9; '}', 'J'-'R' = -0..-9).\n                    if is_signed:\n                        positive_overpunch = '{ABCDEFGHI'\n                        negative_overpunch = '}JKLMNOPQR'\n                        last_char = raw_value[-1]\n                        if last_char in ('+', '-'):\n                            sign = last_char\n                            num_str = raw_value[:-1] # Remove the separate trailing sign\n                        elif last_char in negative_overpunch:\n                            sign = '-'\n                            # Replace the overpunch character with the digit it encodes\n                            num_str = raw_value[:-1] + str(negative_overpunch.index(last_char))\n                        elif last_char in positive_overpunch:\n                            sign = '+'\n                            num_str = raw_value[:-1] + str(positive_overpunch.index(last_char))\n                        # Leading separate sign\n                        elif raw_value.startswith('-'):\n                            sign = '-'\n                            num_str = raw_value[1:]\n                        elif raw_value.startswith('+'):\n                            sign = '+'\n                            num_str = raw_value[1:]\n                        # Add other sign conventions (overpunching middle digits?) 
if needed\n\n                    # Ensure num_str only contains digits at this point if needed\n                    num_str = ''.join(filter(str.isdigit, num_str))\n                    if not num_str: # If only sign was present\n                        raise ValueError(\"Numeric string is empty after sign processing\")\n\n\n                    if implied_decimals \u003e 0:\n                         # Ensure num_str has enough digits before inserting decimal\n                        if len(num_str) \u003e= implied_decimals:\n                             decimal_str = num_str[:-implied_decimals] + \".\" + num_str[-implied_decimals:]\n                        else: # Handle cases like '50' for PIC S9(2)V99 -\u003e 0.50\n                            decimal_str = \"0.\" + num_str.zfill(implied_decimals)\n\n                        # Combine sign and number\n                        # Let Decimal handle the sign placement\n                        processed_value = decimal.Decimal(sign + decimal_str)\n\n                    else:\n                         # Combine sign and number for integer\n                         processed_value = int(sign + num_str)\n\n            # Add more types here if needed (COMP-3 is complex)\n\n            else:\n                 logging.warning(f\"Line {line_num}, Field '{name}': Unsupported COBOL type '{cobol_type}'. Treating as string.\")\n                 processed_value = raw_value\n                 if should_strip:\n                     processed_value = processed_value.strip()\n\n            record[name] = processed_value\n\n        except (ValueError, decimal.InvalidOperation, IndexError) as e:\n             raise CobolParsingError(\n                 f\"Line {line_num}, Field '{name}': Error converting value '{raw_value}' \"\n                 f\"using type '{cobol_type}' (Decimals: {implied_decimals}, Signed: {is_signed}). 
Original error: {e}\"\n             ) from e\n\n    return record\n\n# --- NEW Fibonacci Variant of COBOL Parsing ---\n# NOTE: For demonstration, this is functionally identical to parse_cobol_line.\n# Fibonacci hashing is not applied *during* parsing itself.\ndef parse_cobol_line_fib(line, layout_config, line_num):\n    \"\"\"\n    Parses a single line of fixed-width data based on the layout.\n    (Fibonacci variant - functionally identical to parse_cobol_line for parsing,\n    intended for use in workflows demonstrating Fibonacci hashing elsewhere).\n    \"\"\"\n    # The actual parsing logic is the same.\n    # We are creating this function to fulfill the naming requirement and\n    # allow a separate CLI command. The *demonstration* of Fibonacci hashing\n    # will happen externally (e.g., in the notebook).\n    return parse_cobol_line(line, layout_config, line_num) # Reuse the original logic\n\n\n# --- Original COBOL to JSON Processor ---\ndef process_cobol_to_json(layout_file, infile, outfile):\n    \"\"\"Loads layout, reads COBOL data, parses lines, and writes JSON.\"\"\"\n    try:\n        with open(layout_file, 'r', encoding='utf-8') as f_layout:\n            layout_config = json.load(f_layout)\n    except FileNotFoundError:\n        logging.error(f\"Layout file not found: {layout_file}\")\n        sys.exit(1)\n    except json.JSONDecodeError as e:\n        logging.error(f\"Error decoding JSON layout file '{layout_file}': {e}\")\n        sys.exit(1)\n    except Exception as e:\n        logging.error(f\"Error reading layout file '{layout_file}': {e}\", exc_info=True)\n        sys.exit(1)\n\n    records = []\n    line_num = 0\n    try:\n        for line in infile:\n            line_num += 1\n            # Skip empty lines\n            if not line.strip():\n                continue\n            try:\n                # Call the ORIGINAL parse function\n                record = parse_cobol_line(line, layout_config, line_num)\n                
records.append(record)\n            except CobolParsingError as e:\n                logging.error(str(e)) # Log parsing error for the specific line\n                # Optionally decide whether to skip the line or exit entirely\n                # sys.exit(1) # Uncomment to make it fatal\n                logging.warning(f\"Skipping line {line_num} due to parsing error.\")\n\n        # Use Decimal encoder for output\n        class DecimalEncoder(json.JSONEncoder):\n            def default(self, obj):\n                if isinstance(obj, decimal.Decimal):\n                    # Convert Decimal to string to preserve precision\n                    # Or convert to float: float(obj), but risk precision loss\n                    return str(obj)\n                # Let the base class default method raise the TypeError\n                return super(DecimalEncoder, self).default(obj)\n\n        json.dump(records, outfile, indent=2, cls=DecimalEncoder) # Use indent=2 for pretty print\n        outfile.write('\\n')\n        if outfile is not sys.stdout:\n            logging.info(f\"Successfully converted COBOL data to JSON in {outfile.name}\")\n\n    except FileNotFoundError:\n        # This is already handled by argparse FileType, but as fallback\n        logging.error(f\"Input data file not found: {infile.name}\")\n        sys.exit(1)\n    except Exception as e:\n        logging.error(f\"An error occurred during COBOL data processing: {e}\", exc_info=True)\n        sys.exit(1)\n\n# --- NEW Fibonacci Variant of COBOL to JSON Processor ---\ndef process_cobol_to_json_fib(layout_file, infile, outfile):\n    \"\"\"\n    Loads layout, reads COBOL data, parses lines (using fib variant parser),\n    and writes JSON.\n    (Fibonacci variant - uses parse_cobol_line_fib).\n    \"\"\"\n    try:\n        with open(layout_file, 'r', encoding='utf-8') as f_layout:\n            layout_config = json.load(f_layout)\n    except FileNotFoundError:\n        logging.error(f\"Layout file not found: 
{layout_file}\")\n        sys.exit(1)\n    except json.JSONDecodeError as e:\n        logging.error(f\"Error decoding JSON layout file '{layout_file}': {e}\")\n        sys.exit(1)\n    except Exception as e:\n        logging.error(f\"Error reading layout file '{layout_file}': {e}\", exc_info=True)\n        sys.exit(1)\n\n    records = []\n    line_num = 0\n    try:\n        for line in infile:\n            line_num += 1\n            # Skip empty lines\n            if not line.strip():\n                continue\n            try:\n                # Call the NEW parse_cobol_line_fib function\n                record = parse_cobol_line_fib(line, layout_config, line_num)\n                records.append(record)\n            except CobolParsingError as e:\n                logging.error(str(e)) # Log parsing error for the specific line\n                logging.warning(f\"Skipping line {line_num} due to parsing error.\")\n\n        # Use Decimal encoder for output (same as original)\n        class DecimalEncoder(json.JSONEncoder):\n            def default(self, obj):\n                if isinstance(obj, decimal.Decimal):\n                    return str(obj)\n                return super(DecimalEncoder, self).default(obj)\n\n        json.dump(records, outfile, indent=2, cls=DecimalEncoder)\n        outfile.write('\\n')\n        if outfile is not sys.stdout:\n            logging.info(f\"Successfully converted COBOL data (fib variant) to JSON in {outfile.name}\")\n\n    except FileNotFoundError:\n        logging.error(f\"Input data file not found: {infile.name}\")\n        sys.exit(1)\n    except Exception as e:\n        logging.error(f\"An error occurred during COBOL data processing (fib variant): {e}\", exc_info=True)\n        sys.exit(1)\n\n\n# --- Original JSON Processing Logic ---\ndef process_json(infile, outfile, indent=2, sort_keys=False):\n    \"\"\"Reads JSON from infile, validates, and writes formatted JSON to outfile.\"\"\"\n    try:\n        # Use Decimal hook for 
loading if needed, though less common for general JSON\n        # data = json.load(infile, parse_float=decimal.Decimal)\n        data = json.load(infile)\n        output_indent = indent if indent \u003e 0 else None # Handle indent \u003c= 0 for compact\n        json.dump(data, outfile, indent=output_indent, sort_keys=sort_keys, ensure_ascii=False) # Added ensure_ascii=False\n        outfile.write('\\n') # Ensure newline at the end\n    except json.JSONDecodeError as e:\n        input_source = \"stdin\" if infile is sys.stdin else f\"file '{infile.name}'\"\n        print(f\"Error: Invalid JSON input from {input_source} - {e}\", file=sys.stderr)\n        sys.exit(1)\n    except Exception as e:\n        print(f\"An error occurred during JSON processing: {e}\", file=sys.stderr)\n        sys.exit(1)\n\n# --- NEW Fibonacci Variant of JSON Processing ---\n# NOTE: This is functionally identical to process_json.\n# Fibonacci hashing is not relevant to JSON formatting.\ndef process_json_fib(infile, outfile, indent=2, sort_keys=False):\n    \"\"\"\n    Reads JSON from infile, validates, and writes formatted JSON to outfile.\n    (Fibonacci variant - functionally identical to process_json, serves as an alias\n    for potential workflow consistency if needed, but performs no hashing).\n    \"\"\"\n    # Calls the original function - it's just an alias for the command structure\n    process_json(infile, outfile, indent=indent, sort_keys=sort_keys)\n\n\n# --- Main CLI Logic ---\ndef main():\n    # Set up basic logging to stderr\n    logging.basicConfig(level=logging.INFO, format='%(levelname)s: %(message)s', stream=sys.stderr)\n\n    parser = argparse.ArgumentParser(\n        prog=\"jsoncons\",\n        description=\"Validate/format JSON or convert fixed-width COBOL data to JSON.\"\n    )\n\n    # --- Subparsers setup ---\n    subparsers = parser.add_subparsers(\n        title='Available Commands',\n        dest=\"command\",\n        help='Use a command to process data.',\n       
 required=True\n    )\n\n    # --- Define arguments shared between encode \u0026 decode \u0026 process_json_fib ---\n    common_parser_json = argparse.ArgumentParser(add_help=False)\n    common_parser_json.add_argument(\n        \"infile\", nargs='?', type=argparse.FileType('r', encoding='utf-8'),\n        default=sys.stdin, help=\"Input JSON file (reads from stdin if omitted)\"\n    )\n    common_parser_json.add_argument(\n        \"outfile\", nargs='?', type=argparse.FileType('w', encoding='utf-8'),\n        default=sys.stdout, help=\"Output JSON file (writes to stdout if omitted)\"\n    )\n    common_parser_json.add_argument(\n        \"--indent\", type=int, default=2,\n        help=\"Indentation level for output JSON (use 0 or less for compact, default: 2)\"\n    )\n    common_parser_json.add_argument(\n        \"--sort-keys\", action=\"store_true\", help=\"Sort the keys in the output JSON\"\n    )\n\n    # --- Define arguments shared between cobol_to_json commands ---\n    common_parser_cobol = argparse.ArgumentParser(add_help=False)\n    common_parser_cobol.add_argument(\n        \"--layout-file\",\n        metavar='LAYOUT_JSON',\n        required=True,\n        help=\"Path to the JSON file describing the COBOL record layout.\"\n    )\n    common_parser_cobol.add_argument(\n        \"infile\",\n        # Note: Not defaulting to stdin here, usually COBOL data comes from specific files\n        type=argparse.FileType('r', encoding='utf-8'), # Or 'latin-1' / 'cp037' etc. 
if EBCDIC\n        help=\"Input fixed-width COBOL data file.\"\n    )\n    common_parser_cobol.add_argument(\n        \"outfile\",\n        nargs='?', # Make output file optional, defaulting to stdout\n        type=argparse.FileType('w', encoding='utf-8'),\n        default=sys.stdout,\n        help=\"Output JSON file (writes to stdout if omitted).\"\n    )\n\n\n    # --- 'encode' Subcommand ---\n    parser_encode = subparsers.add_parser(\n        'encode',\n        help='Validate and pretty-print (encode) JSON data.',\n        parents=[common_parser_json]\n    )\n\n    # --- 'decode' Subcommand ---\n    parser_decode = subparsers.add_parser(\n        'decode',\n        help='Alias for encode. Reads JSON, validates, and outputs formatted JSON.',\n        parents=[common_parser_json]\n    )\n\n    # --- 'cobol_to_json' Subcommand ---\n    parser_c2j = subparsers.add_parser(\n        'cobol_to_json',\n        help='Convert fixed-width COBOL data file to JSON using a layout file.',\n        parents=[common_parser_cobol] # Use common cobol args\n    )\n\n    # --- NEW 'process_json_fib' Subcommand ---\n    parser_pjf = subparsers.add_parser(\n        'process_json_fib',\n        help='Alias for encode/decode (Fibonacci variant placeholder).',\n        parents=[common_parser_json] # Use common json args\n    )\n\n    # --- NEW 'cobol_to_json_fib' Subcommand ---\n    parser_c2j_fib = subparsers.add_parser(\n        'cobol_to_json_fib',\n        help='Convert COBOL to JSON (Fibonacci variant - uses specific parser).',\n        parents=[common_parser_cobol] # Use common cobol args\n    )\n\n\n    # --- Parse Arguments ---\n    try:\n        args = parser.parse_args()\n    except Exception as e:\n        logging.error(f\"Error parsing arguments: {e}\")\n        parser.print_help(sys.stderr)\n        sys.exit(2)\n\n    # --- Execute Logic based on command ---\n\n    # Guard against reading and writing to the same file (can corrupt input)\n    if hasattr(args, 'infile') and 
hasattr(args, 'outfile'):\n         if (args.infile is not sys.stdin and args.outfile is not sys.stdout and\n             hasattr(args.infile, 'name') and hasattr(args.outfile, 'name') and\n             os.path.abspath(args.infile.name) == os.path.abspath(args.outfile.name)):\n            logging.error(f\"Input file '{args.infile.name}' and output file '{args.outfile.name}' cannot be the same.\")\n            sys.exit(1)\n\n    # Call the appropriate function based on the command\n    if args.command in [\"encode\", \"decode\"]:\n        # process_json maps indent \u003c= 0 to compact output, so pass the raw value\n        process_json(args.infile, args.outfile, indent=args.indent, sort_keys=args.sort_keys)\n    elif args.command == \"process_json_fib\": # New command\n        # Same indent handling as encode/decode\n        process_json_fib(args.infile, args.outfile, indent=args.indent, sort_keys=args.sort_keys)\n    elif args.command == \"cobol_to_json\":\n        process_cobol_to_json(args.layout_file, args.infile, args.outfile)\n    elif args.command == \"cobol_to_json_fib\": # New command\n        process_cobol_to_json_fib(args.layout_file, args.infile, args.outfile)\n    else:\n        # Should not happen if subparsers are required=True\n        logging.error(f\"Error: Unknown command '{args.command}' encountered.\")\n        parser.print_help(sys.stderr)\n        sys.exit(1)\n\nif __name__ == \"__main__\":\n    main()\n```\n\n### **3. 
Updated `test_cli.py`**\n\n```python\n# tests/test_cli.py\n\nimport unittest\nimport sys\nimport io\nimport os\nimport json\nimport tempfile\nimport shutil\nimport decimal\nfrom unittest.mock import patch\n\n# Add the parent directory (project root) to the Python path\n# This allows importing 'jsoncons' even when running tests directly\n# Adjust the path if your structure is different\nsys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))\n\n# Import the main function from the CLI module\nfrom jsoncons import cli # Use the package name defined in setup.py or directory name\n\n# --- Sample COBOL Data and Layout for Testing ---\n\n# Trailing sign convention: { -\u003e +0, A-I -\u003e +1-9, } -\u003e -0, J-R -\u003e -1-9\nCOBOL_LAYOUT = {\n    \"record_length\": 80,\n    \"fields\": [\n        {\"name\": \"ID\", \"start_pos\": 1, \"length\": 5, \"type\": \"PIC 9\"},\n        {\"name\": \"NAME\", \"start_pos\": 6, \"length\": 20, \"type\": \"PIC X\", \"strip\": True},\n        {\"name\": \"AMOUNT\", \"start_pos\": 26, \"length\": 7, \"type\": \"PIC S9\", \"decimals\": 2, \"signed\": True}, # e.g., 123456E -\u003e +12345.65\n        {\"name\": \"STATUS\", \"start_pos\": 33, \"length\": 1, \"type\": \"PIC X\"},\n        {\"name\": \"UNUSED\", \"start_pos\": 34, \"length\": 47, \"type\": \"PIC X\"}\n    ]\n}\n# AMOUNT: PIC S9(5)V99 held in 7 chars including sign. 
Assume trailing sign.\n# 12345.67 -\u003e 123456G (+)\n# -123.45  -\u003e 001234N (-)\n# 5.00     -\u003e 000050{ (+)\nCOBOL_DATA = \"\"\"\\\n12345TEST USER         123456G A                                                \\r\\n\\\n00001ANOTHER ONE       001234N B                                                \\n\\\n54321BAD DATA          BADAMTX C                                                \\r\\n\\\n00002SHORT LINE        000050{ D\n\"\"\" # line 4 still short\n\nEXPECTED_JSON_OUTPUT = [\n  {\n    \"ID\": 12345,\n    \"NAME\": \"TEST USER\",\n    \"AMOUNT\": \"12345.67\",\n    \"STATUS\": \"A\",\n    \"UNUSED\": \"                                               \"\n   },\n  {\n    \"ID\": 1,\n    \"NAME\": \"ANOTHER ONE\",\n    \"AMOUNT\": \"-123.45\",\n    \"STATUS\": \"B\",\n    \"UNUSED\": \"                                               \"\n  },\n  # Line 3 skipped due to parsing error (BADAMTX for S9V99)\n  { # Line 4 Processed despite short length warning\n    \"ID\": 2,\n    \"NAME\": \"SHORT LINE\",\n    \"AMOUNT\": \"5.00\",\n    \"STATUS\": \"D\",\n    \"UNUSED\": \"                                               \" # Padded\n   }\n]\n\n\nclass TestJsonConsCLI(unittest.TestCase):\n\n    def setUp(self):\n        \"\"\"Set up test fixtures, if any.\"\"\"\n        self.test_dir = tempfile.mkdtemp()\n        self.input_file_path = os.path.join(self.test_dir, 'input.json')\n        self.output_file_path = os.path.join(self.test_dir, 'output.json')\n        self.invalid_file_path = os.path.join(self.test_dir, 'invalid.json')\n        self.cobol_layout_path = os.path.join(self.test_dir, 'layout.json')\n        self.cobol_data_path = os.path.join(self.test_dir, 'cobol.dat')\n\n        # Sample valid JSON data\n        self.valid_data = {\"z\": 1, \"a\": 2, \"items\": [\"x\", \"y\"]}\n        self.valid_json_str = json.dumps(self.valid_data)\n        self.valid_json_pretty = json.dumps(self.valid_data, indent=2) + '\\n'\n        self.valid_json_pretty_sorted 
= json.dumps(self.valid_data, indent=2, sort_keys=True) + '\\n'\n\n        # Sample invalid JSON data\n        self.invalid_json_str = '{\"key\": \"value\", broken'\n\n        # Write sample files\n        with open(self.input_file_path, 'w') as f:\n            f.write(self.valid_json_str)\n        with open(self.invalid_file_path, 'w') as f:\n            f.write(self.invalid_json_str)\n        with open(self.cobol_layout_path, 'w') as f:\n            json.dump(COBOL_LAYOUT, f)\n        with open(self.cobol_data_path, 'w') as f:\n            f.write(COBOL_DATA)\n\n    def tearDown(self):\n        \"\"\"Tear down test fixtures, if any.\"\"\"\n        shutil.rmtree(self.test_dir)\n\n    def run_cli(self, args_list, stdin_data=None):\n        \"\"\"Helper function to run the CLI main function with specific args and stdin.\"\"\"\n        # Patch sys.argv\n        # Use 'jsoncons' or the actual script name if run directly\n        prog_name = 'jsoncons'\n        full_args = [prog_name] + args_list\n        # Use StringIO to capture stdout and stderr\n        stdout_capture = io.StringIO()\n        stderr_capture = io.StringIO()\n        # Patch stdin if stdin_data is provided\n        stdin_patch = None\n        original_stdin = sys.stdin\n        if stdin_data is not None:\n            sys.stdin = io.StringIO(stdin_data) # Directly replace sys.stdin\n\n        exit_code = 0\n        try:\n            # Patch stdout and stderr within the context manager\n            with patch('sys.argv', full_args), \\\n                 patch('sys.stdout', stdout_capture), \\\n                 patch('sys.stderr', stderr_capture):\n                cli.main()\n        except SystemExit as e:\n            exit_code = e.code\n        finally:\n            sys.stdin = original_stdin # Restore original stdin\n\n        return stdout_capture.getvalue(), stderr_capture.getvalue(), exit_code\n\n    # -- JSON Success Cases (encode/decode/process_json_fib) --\n\n    def 
test_encode_stdin_stdout_valid(self):\n        \"\"\"Test encode: reading valid JSON from stdin and writing to stdout.\"\"\"\n        stdout, stderr, exit_code = self.run_cli(['encode'], stdin_data=self.valid_json_str)\n        self.assertEqual(exit_code, 0)\n        self.assertEqual(stderr, '')\n        self.assertEqual(stdout, self.valid_json_pretty) # Default indent is 2\n\n    def test_decode_stdin_stdout_valid(self):\n        \"\"\"Test decode: reading valid JSON from stdin and writing to stdout.\"\"\"\n        stdout, stderr, exit_code = self.run_cli(['decode'], stdin_data=self.valid_json_str)\n        self.assertEqual(exit_code, 0)\n        self.assertEqual(stderr, '')\n        self.assertEqual(stdout, self.valid_json_pretty) # Default indent is 2\n\n    def test_process_json_fib_stdin_stdout_valid(self):\n        \"\"\"Test process_json_fib: reading valid JSON from stdin and writing to stdout.\"\"\"\n        stdout, stderr, exit_code = self.run_cli(['process_json_fib'], stdin_data=self.valid_json_str)\n        self.assertEqual(exit_code, 0)\n        self.assertEqual(stderr, '')\n        self.assertEqual(stdout, self.valid_json_pretty) # Default indent is 2\n\n    def test_json_infile_outfile_valid(self):\n        \"\"\"Test reading valid JSON from file and writing to file (using encode).\"\"\"\n        stdout, stderr, exit_code = self.run_cli(['encode', self.input_file_path, self.output_file_path])\n        self.assertEqual(exit_code, 0)\n        self.assertEqual(stderr, '')\n        self.assertEqual(stdout, '') # Should write to file, not stdout\n        self.assertTrue(os.path.exists(self.output_file_path))\n        with open(self.output_file_path, 'r') as f:\n            content = f.read()\n        self.assertEqual(content, self.valid_json_pretty)\n\n    def test_json_indent_option_4(self):\n        \"\"\"Test the --indent 4 option (using encode).\"\"\"\n        stdout, stderr, exit_code = self.run_cli(['encode', '--indent', '4'], 
stdin_data=self.valid_json_str)\n        expected_output = json.dumps(self.valid_data, indent=4) + '\\n'\n        self.assertEqual(exit_code, 0)\n        self.assertEqual(stderr, '')\n        self.assertEqual(stdout, expected_output)\n\n    def test_json_indent_option_0_compact(self):\n        \"\"\"Test the --indent 0 option for compact output (using encode).\"\"\"\n        stdout, stderr, exit_code = self.run_cli(['encode', '--indent', '0'], stdin_data=self.valid_json_str)\n        # Compact output (no indent) with a trailing newline\n        expected_output = json.dumps(self.valid_data, indent=None, separators=(',', ':')) + '\\n'\n        self.assertEqual(exit_code, 0)\n        self.assertEqual(stderr, '')\n        self.assertEqual(stdout, expected_output)\n\n    def test_json_sort_keys_option(self):\n        \"\"\"Test the --sort-keys option (using encode).\"\"\"\n        stdout, stderr, exit_code = self.run_cli(['encode', '--sort-keys'], stdin_data=self.valid_json_str)\n        self.assertEqual(exit_code, 0)\n        self.assertEqual(stderr, '')\n        self.assertEqual(stdout, self.valid_json_pretty_sorted)\n\n\n    # -- JSON Error Cases --\n\n    def test_invalid_json_stdin(self):\n        \"\"\"Test reading invalid JSON from stdin (using encode).\"\"\"\n        stdout, stderr, exit_code = self.run_cli(['encode'], stdin_data=self.invalid_json_str)\n        self.assertNotEqual(exit_code, 0, \"Exit code should be non-zero for invalid JSON\")\n        self.assertEqual(stdout, '')\n        self.assertIn(\"Error: Invalid JSON input\", stderr)\n        self.assertIn(\"stdin\", stderr)\n\n    def test_invalid_json_infile(self):\n        \"\"\"Test reading invalid JSON from a file (using encode).\"\"\"\n        stdout, stderr, exit_code = self.run_cli(['encode', self.invalid_file_path])\n        self.assertNotEqual(exit_code, 0)\n        self.assertEqual(stdout, '')\n        self.assertIn(\"Error: Invalid JSON input\", stderr)\n        self.assertIn(f\"file 
'{self.invalid_file_path}'\", stderr)\n\n    def test_same_input_output_file(self):\n        \"\"\"Test error when input and output file paths are the same.\"\"\"\n        stdout, stderr, exit_code = self.run_cli(['encode', self.input_file_path, self.input_file_path])\n        self.assertNotEqual(exit_code, 0)\n        self.assertEqual(stdout, '')\n        self.assertIn(\"cannot be the same\", stderr)\n        self.assertIn(self.input_file_path, stderr)\n\n    # --- COBOL to JSON Tests ---\n\n    def test_cobol_to_json_success(self):\n        \"\"\"Test successful cobol_to_json conversion.\"\"\"\n        stdout, stderr, exit_code = self.run_cli([\n            'cobol_to_json',\n            '--layout-file', self.cobol_layout_path,\n            self.cobol_data_path,\n            self.output_file_path\n        ])\n        self.assertEqual(exit_code, 0)\n        # Check stderr for warnings about short line and parse error\n        self.assertIn(\"Expected length 80, got 34\", stderr) # Warning for line 4\n        self.assertIn(\"Error converting value 'BADAMTX'\", stderr) # Error for line 3\n        self.assertIn(\"Skipping line 3\", stderr) # Warning for skipping line 3\n        # Check output file content\n        self.assertTrue(os.path.exists(self.output_file_path))\n        with open(self.output_file_path, 'r') as f:\n            content = f.read()\n        # Parse the output JSON and the expected JSON for comparison\n        # Use object_pairs_hook with Decimal for comparison if needed, but str comparison works here\n        try:\n            output_data = json.loads(content) # Parse output back\n            expected_data_parsed = json.loads(json.dumps(EXPECTED_JSON_OUTPUT)) # Ensure expected is also parsed\n            self.assertEqual(output_data, expected_data_parsed)\n        except json.JSONDecodeError as e:\n            self.fail(f\"Output file content is not valid JSON: {e}\\nContent:\\n{content}\")\n\n\n    def test_cobol_to_json_fib_success(self):\n       
 \"\"\"Test successful cobol_to_json_fib conversion.\"\"\"\n        stdout, stderr, exit_code = self.run_cli([\n            'cobol_to_json_fib', # Use the fib command\n            '--layout-file', self.cobol_layout_path,\n            self.cobol_data_path,\n            self.output_file_path\n        ])\n        self.assertEqual(exit_code, 0)\n        # Check stderr for warnings (should be identical to non-fib version)\n        self.assertIn(\"Expected length 80, got 34\", stderr) # Warning for line 4\n        self.assertIn(\"Error converting value 'BADAMTX'\", stderr) # Error for line 3\n        self.assertIn(\"Skipping line 3\", stderr) # Warning for skipping line 3\n        # Check output file content (should be identical to non-fib version)\n        self.assertTrue(os.path.exists(self.output_file_path))\n        with open(self.output_file_path, 'r') as f:\n            content = f.read()\n        try:\n            output_data = json.loads(content)\n            expected_data_parsed = json.loads(json.dumps(EXPECTED_JSON_OUTPUT))\n            self.assertEqual(output_data, expected_data_parsed)\n        except json.JSONDecodeError as e:\n            self.fail(f\"Output file content is not valid JSON: {e}\\nContent:\\n{content}\")\n\n    def test_cobol_to_json_layout_not_found(self):\n        \"\"\"Test cobol_to_json with non-existent layout file.\"\"\"\n        stdout, stderr, exit_code = self.run_cli([\n            'cobol_to_json',\n            '--layout-file', 'nonexistent_layout.json',\n            self.cobol_data_path\n        ])\n        self.assertNotEqual(exit_code, 0)\n        self.assertIn(\"Layout file not found\", stderr)\n\n    def test_cobol_to_json_input_not_found(self):\n        \"\"\"Test cobol_to_json with non-existent input file.\"\"\"\n        # Argparse handles this before our code runs, but good to have a sense\n        # We need to bypass FileType check for this test, or check stderr directly\n        with 
patch('argparse.ArgumentParser.parse_args', side_effect=SystemExit(2)):\n             # Simulate argparse failing\n             stdout, stderr, exit_code = self.run_cli([\n                 'cobol_to_json',\n                 '--layout-file', self.cobol_layout_path,\n                 'nonexistent_cobol.dat'\n             ])\n             # Since we mocked parse_args, main doesn't fully run.\n             # This test mainly ensures the CLI structure handles missing files via argparse.\n             # A different approach would be needed to test the fallback error message inside process_cobol_to_json\n             self.assertEqual(exit_code, 2) # Argparse usually exits with 2 for bad arguments\n\n    # --- Fibonacci Hashing Function Tests (Directly testing the helper) ---\n    def test_fibonacci_hash_function(self):\n        \"\"\"Test the fibonacci_hash_to_index helper function directly.\"\"\"\n        # Test case from fibhash-brief.txt (3 bits -\u003e table size 8)\n        # Note: The text uses size_t hash input, let's use small ints\n        # The text's example shifts by 61 (64-3). 
Our function calculates shift.\n        table_size = 8\n        self.assertEqual(cli.fibonacci_hash_to_index(0, table_size), 0) # Expected 0\n        self.assertEqual(cli.fibonacci_hash_to_index(1, table_size), 4) # Expected 4\n        self.assertEqual(cli.fibonacci_hash_to_index(2, table_size), 1) # Expected 1 (2 * magic wraps past 2^64; top 3 bits are 001)\n        self.assertEqual(cli.fibonacci_hash_to_index(3, table_size), 6) # Expected 6\n        self.assertEqual(cli.fibonacci_hash_to_index(4, table_size), 3) # Expected 3\n        self.assertEqual(cli.fibonacci_hash_to_index(5, table_size), 0) # Expected 0 (Collision with 0)\n        # Check edge case: the largest 64-bit hash\n        # 0xFFFFFFFFFFFFFFFF * magic mod 2^64 == 2^64 - magic, whose top 3 bits are 011\n        self.assertEqual(cli.fibonacci_hash_to_index(0xFFFFFFFFFFFFFFFF, table_size), 3)\n\n        # Test different table size\n        table_size_1024 = 1024\n        idx1 = cli.fibonacci_hash_to_index(123456789, table_size_1024)\n        idx2 = cli.fibonacci_hash_to_index(123456790, table_size_1024)\n        self.assertIsInstance(idx1, int)\n        self.assertGreaterEqual(idx1, 0)\n        self.assertLess(idx1, table_size_1024)\n        self.assertNotEqual(idx1, idx2) # Consecutive inputs land roughly 0.618 * table_size slots apart\n\n        # Test invalid table size\n        with self.assertRaises(ValueError):\n            cli.fibonacci_hash_to_index(10, 7) # Not power of 2\n        with self.assertRaises(ValueError):\n            cli.fibonacci_hash_to_index(10, 0)\n        with self.assertRaises(ValueError):\n            cli.fibonacci_hash_to_index(10, -8)\n\n\nif __name__ == '__main__':\n    unittest.main()\n```\n\n### **4. 
Jupyter Notebook (`Fibonacci_Hashing_Demo.ipynb`)**\n\n```python\n{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"# Fibonacci Hashing Demonstration\\n\",\n    \"\\n\",\n    \"This notebook demonstrates the principles and performance characteristics of Fibonacci Hashing compared to standard modulo hashing for mapping hash values to hash table indices.\\n\",\n    \"\\n\",\n    \"**Concept:**\\n\",\n    \"Fibonacci hashing is a form of multiplicative hashing using a constant related to the golden ratio (`phi`). It maps a large hash value (e.g., 64-bit) into a smaller range (the size of a hash table, often a power of 2) using the formula:\\n\",\n    \"\\n\",\n    \"`index = (hash * MAGIC_CONSTANT) \u003e\u003e shift_amount`\\n\",\n    \"\\n\",\n    \"Where:\\n\",\n    \"*   `MAGIC_CONSTANT` is approximately `2^64 / phi` (specifically `11400714819323198485` for 64 bits).\\n\",\n    \"*   `shift_amount` is calculated based on the table size (`64 - log2(table_size)`).\\n\",\n    \"\\n\",\n    \"**Advantages (as per fibhash-brief.txt):**\\n\",\n    \"*   **Speed:** Very fast (integer multiplication and bit shift).\\n\",\n    \"*   **Distribution:** Mixes input bits well, reducing clustering compared to simple modulo or taking low bits (bitwise AND), especially with patterned input data.\\n\",\n    \"\\n\",\n    \"We will use data generated by the `jsoncons` tool (from a sample COBOL file) as input keys for our hashing demonstration.\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"import json\\n\",\n    \"import timeit\\n\",\n    \"import math\\n\",\n    \"import os\\n\",\n    \"import sys\\n\",\n    \"import random\\n\",\n    \"import subprocess\\n\",\n    \"import matplotlib.pyplot as plt\\n\",\n    \"import numpy as np\\n\",\n    \"\\n\",\n    \"# Assuming cli.py is in the parent directory or jsoncons is 
installed\\n\",\n    \"# If cli.py is in parent dir:\\n\",\n    \"sys.path.insert(0, os.path.abspath('..'))\\n\",\n    \"try:\\n\",\n    \"    from jsoncons import cli\\n\",\n    \"    FIB_HASH_64_MAGIC = cli.FIB_HASH_64_MAGIC\\n\",\n    \"    fibonacci_hash_to_index = cli.fibonacci_hash_to_index\\n\",\n    \"    print(\\\"Imported cli functions directly.\\\")\\n\",\n    \"except (ImportError, AttributeError):\\n\",\n    \"    print(\\\"Could not import cli functions directly. Defining locally.\\\")\\n\",\n    \"    # Define constants and functions locally if the import or attribute lookup fails\\n\",\n    \"    FIB_HASH_64_MAGIC = 11400714819323198485\\n\",\n    \"\\n\",\n    \"    def fibonacci_hash_to_index(hash_value: int, table_size_power_of_2: int) -\u003e int:\\n\",\n    \"        if table_size_power_of_2 \u003c= 0 or (table_size_power_of_2 \u0026 (table_size_power_of_2 - 1)) != 0:\\n\",\n    \"            raise ValueError(\\\"table_size_power_of_2 must be a positive power of 2.\\\")\\n\",\n    \"        # Emulate 64-bit unsigned wraparound, since Python ints are unbounded\\n\",\n    \"        hash_value \u0026= 0xFFFFFFFFFFFFFFFF\\n\",\n    \"        magic_product = (hash_value * FIB_HASH_64_MAGIC) \u0026 0xFFFFFFFFFFFFFFFF\\n\",\n    \"        # bit_length(8) is 4 but we need 3 index bits, so subtract 1 -\u003e shift 64-3=61\\n\",\n    \"        # bit_length(1024) is 11, we need 10 bits -\u003e shift 64-10=54\\n\",\n    \"        shift_amount = 64 - (table_size_power_of_2.bit_length() - 1)\\n\",\n    \"        return magic_product \u003e\u003e shift_amount\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"## 1. 
Setup: Hashing Functions and Sample Data Generation\\n\",\n    \"\\n\",\n    \"First, let's define the hashing functions we want to compare and generate some sample JSON data using the `jsoncons` tool.\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"# --- Hashing Functions to Compare ---\\n\",\n    \"\\n\",\n    \"def modulo_hash_to_index(hash_value: int, table_size: int) -\u003e int:\\n\",\n    \"    \\\"\\\"\\\"Standard modulo for index mapping.\\\"\\\"\\\"\\n\",\n    \"    # Simulate non-power-of-2 table size scenario where modulo is needed\\n\",\n    \"    # For fair comparison, use a size slightly off power-of-2\\n\",\n    \"    return hash_value % table_size\\n\",\n    \"\\n\",\n    \"def bitwise_and_hash_to_index(hash_value: int, table_size_power_of_2: int) -\u003e int:\\n\",\n    \"    \\\"\\\"\\\"Bitwise AND for power-of-2 table size mapping (fast but uses low bits).\\\"\\\"\\\"\\n\",\n    \"    # Requires table_size to be power of 2\\n\",\n    \"    if table_size_power_of_2 \u003c= 0 or (table_size_power_of_2 \u0026 (table_size_power_of_2 - 1)) != 0:\\n\",\n    \"        raise ValueError(\\\"table_size_power_of_2 must be a positive power of 2.\\\")\\n\",\n    \"    return hash_value \u0026 (table_size_power_of_2 - 1)\\n\",\n    \"\\n\",\n    \"# --- Sample Data Setup ---\\n\",\n    \"# Create temporary COBOL layout and data files (similar to test_cli.py)\\n\",\n    \"temp_dir = \\\"./temp_hash_demo\\\"\\n\",\n    \"os.makedirs(temp_dir, exist_ok=True)\\n\",\n    \"\\n\",\n    \"cobol_layout_path = os.path.join(temp_dir, 'layout.json')\\n\",\n    \"cobol_data_path = os.path.join(temp_dir, 'cobol.dat')\\n\",\n    \"output_json_path = os.path.join(temp_dir, 'output.json')\\n\",\n    \"\\n\",\n    \"# Using the same layout and data as in test_cli.py\\n\",\n    \"cobol_layout = {\\n\",\n    \"    \\\"record_length\\\": 80,\\n\",\n    \"    \\\"fields\\\": 
[\\n\",\n    \"        {\\\"name\\\": \\\"ID\\\", \\\"start_pos\\\": 1, \\\"length\\\": 5, \\\"type\\\": \\\"PIC 9\\\"},\\n\",\n    \"        {\\\"name\\\": \\\"NAME\\\", \\\"start_pos\\\": 6, \\\"length\\\": 20, \\\"type\\\": \\\"PIC X\\\", \\\"strip\\\": True},\\n\",\n    \"        {\\\"name\\\": \\\"AMOUNT\\\", \\\"start_pos\\\": 26, \\\"length\\\": 7, \\\"type\\\": \\\"PIC S9\\\", \\\"decimals\\\": 2, \\\"signed\\\": True}, # e.g., 123456G -\u003e +12345.67\\n\",\n    \"        {\\\"name\\\": \\\"STATUS\\\", \\\"start_pos\\\": 33, \\\"length\\\": 1, \\\"type\\\": \\\"PIC X\\\"},\\n\",\n    \"        {\\\"name\\\": \\\"UNUSED\\\", \\\"start_pos\\\": 34, \\\"length\\\": 47, \\\"type\\\": \\\"PIC X\\\"}\\n\",\n    \"    ]\\n\",\n    \"}\\n\",\n    \"\\n\",\n    \"# Create more data for a better demo\\n\",\n    \"def generate_cobol_line(rec_id, name, amount_val, status):\\n\",\n    \"    id_str = str(rec_id).zfill(5)\\n\",\n    \"    name_str = name.ljust(20)\\n\",\n    \"    \\n\",\n    \"    # Convert amount to PIC S9(5)V99 format (7 chars, trailing sign)\\n\",\n    \"    sign = '+' if amount_val \u003e= 0 else '-'\\n\",\n    \"    num_part = str(abs(int(round(amount_val * 100)))).zfill(7) # 5 integer, 2 decimal digits\\n\",\n    \"    \\n\",\n    \"    sign_char = ''\\n\",\n    \"    last_digit = int(num_part[-1])\\n\",\n    \"    if sign == '+':\\n\",\n    \"        sign_char = chr(ord('{') + last_digit) # '{' for 0, A-I for 1-9 (EBCDIC style often mapped)\\n\",\n    \"        # Using ASCII friendly approximation: {ABCDEFGHI\\n\",\n    \"        sign_map_pos = \\\"{ABCDEFGHI\\\"\\n\",\n    \"        sign_char = sign_map_pos[last_digit] \\n\",\n    \"    else: # sign == '-'\\n\",\n    \"        sign_map_neg = \\\"}JKLMNOPQR\\\"\\n\",\n    \"        sign_char = sign_map_neg[last_digit]\\n\",\n    \"        \\n\",\n    \"    amount_str = num_part[:-1] + sign_char\\n\",\n    \"    amount_str = amount_str.ljust(7) # Ensure length\\n\",\n    \"    \\n\",\n    \"    
status_str = status.ljust(1)\\n\",\n    \"    unused_str = \\\"\\\".ljust(47)\\n\",\n    \"    line = f\\\"{id_str}{name_str}{amount_str}{status_str}{unused_str}\\\"\\n\",\n    \"    # Ensure exact record length if layout specifies it\\n\",\n    \"    # line = line.ljust(cobol_layout['record_length'])\\n\",\n    \"    return line + \\\"\\\\n\\\" # Add newline\\n\",\n    \"\\n\",\n    \"NUM_RECORDS = 5000\\n\",\n    \"cobol_data_content = \\\"\\\"\\n\",\n    \"random.seed(42) # for reproducible names/amounts\\n\",\n    \"for i in range(1, NUM_RECORDS + 1):\\n\",\n    \"    # Add some sequential and some random IDs\\n\",\n    \"    rec_id = i if i % 10 != 0 else random.randint(NUM_RECORDS, NUM_RECORDS * 2)\\n\",\n    \"    name = f\\\"USER {random.randint(100, 999)}\\\"\\n\",\n    \"    amount = random.uniform(-5000, 20000)\\n\",\n    \"    status = random.choice(['A', 'B', 'C', 'I', 'X'])\\n\",\n    \"    cobol_data_content += generate_cobol_line(rec_id, name, amount, status)\\n\",\n    \"\\n\",\n    \"# Add a known bad line\\n\",\n    \"cobol_data_content += \\\"54321BAD DATA          BADAMTX C                                                \\\\n\\\"\\n\",\n    \"\\n\",\n    \"with open(cobol_layout_path, 'w') as f:\\n\",\n    \"    json.dump(cobol_layout, f)\\n\",\n    \"with open(cobol_data_path, 'w') as f:\\n\",\n    \"    f.write(cobol_data_content)\\n\",\n    \"\\n\",\n    \"# --- Run jsoncons to generate JSON data ---\\n\",\n    \"# Check if cli.py exists to run as script, otherwise assume installed\\n\",\n    \"cli_script_path = os.path.abspath('../jsoncons/cli.py')\\n\",\n    \"command_base = [sys.executable, cli_script_path] if os.path.exists(cli_script_path) else ['jsoncons']\\n\",\n    \"\\n\",\n    \"try:\\n\",\n    \"    print(f\\\"Running: {' '.join(command_base + ['cobol_to_json', '--layout-file', cobol_layout_path, cobol_data_path, output_json_path])}\\\")\\n\",\n    \"    # Use subprocess to run the command\\n\",\n    \"    result = 
subprocess.run(command_base + ['cobol_to_json', \\n\",\n    \"                                             '--layout-file', cobol_layout_path, \\n\",\n    \"                                             cobol_data_path, \\n\",\n    \"                                             output_json_path], \\n\",\n    \"                            capture_output=True, text=True, check=True)\\n\",\n    \"    print(\\\"jsoncons executed successfully.\\\")\\n\",\n    \"    # print(\\\"stderr:\\\", result.stderr)\\n\",\n    \"except FileNotFoundError:\\n\",\n    \"    print(f\\\"Error: Could not find {' '.join(command_base)}. Make sure jsoncons is installed or cli.py path is correct.\\\")\\n\",\n    \"    data = [] # Set data to empty list to avoid errors later\\n\",\n    \"except subprocess.CalledProcessError as e:\\n\",\n    \"    print(f\\\"Error running jsoncons: {e}\\\")\\n\",\n    \"    print(f\\\"stderr: {e.stderr}\\\")\\n\",\n    \"    data = []\\n\",\n    \"except Exception as e:\\n\",\n    \"    print(f\\\"An unexpected error occurred: {e}\\\")\\n\",\n    \"    data = []\\n\",\n    \"\\n\",\n    \"# Load the generated JSON data\\n\",\n    \"try:\\n\",\n    \"    with open(output_json_path, 'r') as f:\\n\",\n    \"        # Use Decimal to load amounts precisely if needed for hashing\\n\",\n    \"        # data = json.load(f, parse_float=decimal.Decimal, parse_int=...) \\n\",\n    \"        data = json.load(f)\\n\",\n    \"    print(f\\\"Successfully loaded {len(data)} records from {output_json_path}\\\")\\n\",\n    \"    # print(\\\"First few records:\\\", data[:3])\\n\",\n\n    \"except FileNotFoundError:\\n\",\n    \"    print(f\\\"Error: Output file {output_json_path} not found. 
jsoncons might have failed.\\\")\\n\",\n    \"    data = []\\n\",\n    \"except json.JSONDecodeError as e:\\n\",\n    \"    print(f\\\"Error decoding JSON from {output_json_path}: {e}\\\")\\n\",\n    \"    data = []\\n\",\n    \"\\n\",\n    \"# Prepare keys and hashes for the benchmark\\n\",\n    \"if data:\\n\",\n    \"    # Use the 'ID' field as the key for hashing\\n\",\n    \"    keys = [record['ID'] for record in data if 'ID' in record] \\n\",\n    \"    # Pre-compute Python's built-in hash (result depends on Python version/platform)\\n\",\n    \"    # Treat these as our 'good' 64-bit hash inputs for mapping demonstration\\n\",\n    \"    hashes = [hash(key) for key in keys]\\n\",\n    \"    print(f\\\"Prepared {len(hashes)} hash values for benchmarking.\\\")\\n\",\n    \"else:\\n\",\n    \"    print(\\\"No data loaded, skipping hash preparation.\\\")\\n\",\n    \"    hashes = []\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"## 2. 
Performance Benchmark: Mapping Hashes to Indices\\n\",\n    \"\\n\",\n    \"Now, let's measure the time taken to map the pre-computed hash values to indices for a hash table using the different methods.\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"if hashes:\\n\",\n    \"    # Choose table sizes\\n\",\n    \"    TABLE_SIZE_POW2 = 1024 * 16 # Power of 2 for fibonacci and bitwise AND\\n\",\n    \"    TABLE_SIZE_NON_POW2 = TABLE_SIZE_POW2 - 1 # Slightly different size for modulo\\n\",\n    \"    \\n\",\n    \"    num_iterations = 100 # Number of times to repeat the mapping loop\\n\",\n    \"    num_hashes = len(hashes)\\n\",\n    \"    \\n\",\n    \"    print(f\\\"Benchmarking mapping {num_hashes} hashes {num_iterations} times...\\\")\\n\",\n    \"    print(f\\\"Table size (Power of 2): {TABLE_SIZE_POW2}\\\")\\n\",\n    \"    print(f\\\"Table size (Non-Power of 2): {TABLE_SIZE_NON_POW2}\\\")\\n\",\n    \"    \\n\",\n    \"    # --- Time Fibonacci Hashing ---\\n\",\n    \"    fib_time = timeit.timeit(\\n\",\n    \"        stmt='[fibonacci_hash_to_index(h, TABLE_SIZE_POW2) for h in hashes]', \\n\",\n    \"        globals=globals(), \\n\",\n    \"        number=num_iterations\\n\",\n    \"    )\\n\",\n    \"    print(f\\\"Fibonacci Hashing Time: {fib_time:.6f} seconds\\\")\\n\",\n    \"\\n\",\n    \"    # --- Time Modulo Hashing ---\\n\",\n    \"    mod_time = timeit.timeit(\\n\",\n    \"        stmt='[modulo_hash_to_index(h, TABLE_SIZE_NON_POW2) for h in hashes]', \\n\",\n    \"        globals=globals(), \\n\",\n    \"        number=num_iterations\\n\",\n    \"    )\\n\",\n    \"    print(f\\\"Modulo Hashing Time:    {mod_time:.6f} seconds\\\")\\n\",\n    \"\\n\",\n    \"    # --- Time Bitwise AND Hashing ---\\n\",\n    \"    and_time = timeit.timeit(\\n\",\n    \"        stmt='[bitwise_and_hash_to_index(h, TABLE_SIZE_POW2) for h in hashes]', \\n\",\n    \"        
globals=globals(), \\n\",\n    \"        number=num_iterations\\n\",\n    \"    )\\n\",\n    \"    print(f\\\"Bitwise AND Hashing Time: {and_time:.6f} seconds\\\")\\n\",\n    \"else:\\n\",\n    \"    print(\\\"No hashes generated, skipping benchmark.\\\")\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"**Benchmark Interpretation:**\\n\",\n    \"\\n\",\n    \"*   **Fibonacci vs. Modulo:** In compiled languages, Fibonacci hashing is typically much faster than the modulo operator (`%`) when the divisor isn't known at compile time (as simulated here with `TABLE_SIZE_NON_POW2`), because an integer division costs far more than a multiply and a shift. In this pure-Python benchmark, function-call overhead and the extra validation inside `fibonacci_hash_to_index` dominate, so the measured gap may be small or even reversed; treat the timings as illustrative only.\\n\",\n    \"*   **Fibonacci vs. Bitwise AND:** Fibonacci hashing (multiply + shift) does slightly more work than a single bitwise AND, but the difference is usually small. Its advantage is that it mixes all bits of the hash into the index, whereas bitwise AND keeps only the low bits.\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"## 3. Distribution Analysis\\n\",\n    \"\\n\",\n    \"Let's visualize how well each method distributes the hash values across the available indices. 
Ideally, we want a uniform distribution to minimize collisions in a hash table.\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {\n    \"scrolled\": true\n   },\n   \"outputs\": [],\n   \"source\": [\n    \"if hashes:\\n\",\n    \"    # Calculate indices using each method\\n\",\n    \"    fib_indices = [fibonacci_hash_to_index(h, TABLE_SIZE_POW2) for h in hashes]\\n\",\n    \"    # For modulo, use the power-of-2 size for direct visual comparison with others\\n\",\n    \"    mod_indices = [modulo_hash_to_index(h, TABLE_SIZE_POW2) for h in hashes] \\n\",\n    \"    and_indices = [bitwise_and_hash_to_index(h, TABLE_SIZE_POW2) for h in hashes]\\n\",\n    \"    \\n\",\n    \"    # Plot histograms\\n\",\n    \"    num_bins = min(TABLE_SIZE_POW2 // 4, 100) # Adjust number of bins for clarity\\n\",\n    \"    \\n\",\n    \"    fig, axes = plt.subplots(3, 1, figsize=(12, 15), sharex=True, sharey=True)\\n\",\n    \"    fig.suptitle(f'Distribution of Hash Indices (Table Size = {TABLE_SIZE_POW2})', fontsize=16)\\n\",\n    \"    \\n\",\n    \"    axes[0].hist(fib_indices, bins=num_bins, color='skyblue', edgecolor='black')\\n\",\n    \"    axes[0].set_title('Fibonacci Hashing')\\n\",\n    \"    axes[0].set_ylabel('Frequency')\\n\",\n    \"    \\n\",\n    \"    axes[1].hist(mod_indices, bins=num_bins, color='lightcoral', edgecolor='black')\\n\",\n    \"    axes[1].set_title('Modulo Hashing (%)')\\n\",\n    \"    axes[1].set_ylabel('Frequency')\\n\",\n    \"    \\n\",\n    \"    axes[2].hist(and_indices, bins=num_bins, color='lightgreen', edgecolor='black')\\n\",\n    \"    axes[2].set_title('Bitwise AND Hashing (\u0026)')\\n\",\n    \"    axes[2].set_xlabel('Hash Table Index')\\n\",\n    \"    axes[2].set_ylabel('Frequency')\\n\",\n    \"    \\n\",\n    \"    plt.tight_layout(rect=[0, 0.03, 1, 0.96]) # Adjust layout to prevent title overlap\\n\",\n    \"    plt.show()\\n\",\n    \"else:\\n\",\n    \"    print(\\\"No hashes 
generated, skipping distribution analysis.\\\")\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"**Distribution Interpretation:**\\n\",\n    \"\\n\",\n    \"*   **Fibonacci Hashing:** Should show a relatively flat, uniform distribution, indicating good spreading of hash values.\\n\",\n    \"*   **Modulo Hashing:** Might also show good distribution if the input hashes are already well-distributed. However, it can perform poorly if there are patterns in the hash values that align with the table size. Because this cell uses the power-of-2 table size for modulo, `h % size` equals masking with `size - 1` for the non-negative hashes used here, so this panel matches the bitwise AND panel.\\n\",\n    \"*   **Bitwise AND:** This method only uses the lower bits of the hash. If the higher bits contained important variation (or if the lower bits have poor distribution from the original hash function), this method can lead to significant clustering and poor performance. Fibonacci hashing avoids this by mixing *all* bits.\\n\",\n    \"\\n\",\n    \"The IDs in our generated data are mostly sequential (`rec_id = i`), and in CPython `hash()` of a small non-negative integer is the integer itself. Sequential low bits already spread evenly, so modulo and bitwise AND look uniform on this data; bitwise AND degrades when the low bits are patterned, for example when every key is a multiple of a power of 2.\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"## 4. Conclusion\\n\",\n    \"\\n\",\n    \"This notebook demonstrated that:\\n\",\n    \"\\n\",\n    \"1.  **Fibonacci hashing is a cheap method** for mapping hash values to table indices: one multiply and one shift. In compiled code this typically beats modulo by a runtime divisor and is close to bitwise AND; the Python timings above are dominated by interpreter overhead, so read them as illustrative.\\n\",\n    \"2.  **Fibonacci hashing provides excellent distribution**, mixing the input hash bits effectively. This makes it more robust than bitwise AND against poor hash functions or patterned inputs, which can mean fewer collisions and better overall hash table performance.\\n\",\n    \"\\n\",\n    \"While we added `_fib` variants to the `jsoncons` tool's commands and functions, the core COBOL parsing and JSON formatting logic **does not directly benefit** from Fibonacci hashing. 
The real advantage lies in using this technique *within* data structures that rely on hashing, such as hash maps/dictionaries, as demonstrated by the benchmark and distribution analysis above.\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"# Cleanup temporary files\\n\",\n    \"try:\\n\",\n    \"    import shutil\\n\",\n    \"    if os.path.exists(temp_dir):\\n\",\n    \"        shutil.rmtree(temp_dir)\\n\",\n    \"        print(f\\\"Cleaned up temporary directory: {temp_dir}\\\")\\n\",\n    \"except Exception as e:\\n\",\n    \"    print(f\\\"Error during cleanup: {e}\\\")\"\n   ]\n  }\n ],\n \"metadata\": {\n  \"kernelspec\": {\n   \"display_name\": \"Python 3\",\n   \"language\": \"python\",\n   \"name\": \"python3\"\n  },\n  \"language_info\": {\n   \"codemirror_mode\": {\n    \"name\": \"ipython\",\n    \"version\": 3\n   },\n   \"file_extension\": \".py\",\n   \"mimetype\": \"text/x-python\",\n   \"name\": \"python\",\n   \"nbconvert_exporter\": \"python\",\n   \"pygments_lexer\": \"ipython3\",\n   \"version\": \"3.9.7\"\n  }\n },\n \"nbformat\": 4,\n \"nbformat_minor\": 4\n}\n```\n## Dev Notes\n### **Explanation:**\n\n1.  **`cli.py` Changes:**\n    *   Added the `fibonacci_hash_to_index` helper function based on the text/image.\n    *   Created `parse_cobol_line_fib` and `process_cobol_to_json_fib` which *call* or *reuse* the original parsing logic. They exist to provide the separate CLI endpoint.\n    *   Created `process_json_fib` as an alias to `process_json`.\n    *   Added new subparsers (`cobol_to_json_fib`, `process_json_fib`) in `main()` that point to these new functions.\n\n2.  
**`test_cli.py` Changes:**\n    *   Added sample COBOL layout and data (corrected based on initial analysis).\n    *   Added tests for the new `cobol_to_json_fib` command, ensuring its output is identical to that of the original `cobol_to_json`, because the underlying parsing logic is the same.\n    *   Added a simple test for `process_json_fib` to ensure it behaves like `encode`/`decode`.\n    *   Added direct unit tests for the `fibonacci_hash_to_index` helper function itself to verify its logic against expected values and edge cases.\n\n3.  **`Fibonacci_Hashing_Demo.ipynb`:**\n    *   **Explains:** States the purpose and benefits of Fibonacci hashing.\n    *   **Sets up:** Defines the hashing functions (Fibonacci, Modulo, Bitwise AND) and generates sample COBOL data, then uses `jsoncons` (via subprocess) to create the input JSON data.\n    *   **Benchmarks:** Uses `timeit` to compare the speed of mapping a list of pre-computed hash values (derived from the JSON data's IDs) to table indices using the three different methods.\n    *   **Analyzes Distribution:** Creates histograms to visually compare how uniformly each method spreads the indices across the hash table range.\n    *   **Concludes:** Summarizes the findings regarding speed and distribution, reiterating that the benefit is in the hash-to-index mapping stage, not the parsing/formatting itself.\n    *   **Cleans up:** Removes temporary files.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffapulito%2Fjsoncons","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffapulito%2Fjsoncons","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffapulito%2Fjsoncons/lists"}