{"id":20094760,"url":"https://github.com/turborium/sse2sample","last_synced_at":"2025-05-06T04:32:48.767Z","repository":{"id":246415191,"uuid":"820726034","full_name":"turborium/SSE2Sample","owner":"turborium","description":"Example of using SSE2","archived":false,"fork":false,"pushed_at":"2024-06-28T15:42:01.000Z","size":3153,"stargazers_count":6,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-09T08:44:32.098Z","etag":null,"topics":["assembly","delphi","demo","freepascal","graphics-programming","intrinsics","objectpascal","pascal","sample","sse2","x86-assembly"],"latest_commit_sha":null,"homepage":"","language":"Pascal","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/turborium.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-06-27T04:08:57.000Z","updated_at":"2024-08-17T13:57:53.000Z","dependencies_parsed_at":"2024-06-27T21:49:47.098Z","dependency_job_id":"d64dd545-b736-441f-9399-9fba2bfb1a23","html_url":"https://github.com/turborium/SSE2Sample","commit_stats":null,"previous_names":["turborium/sse2sample"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/turborium%2FSSE2Sample","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/turborium%2FSSE2Sample/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/turborium%2FSSE2Sample/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/turborium%2FSSE2Sample/ma
nifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/turborium","download_url":"https://codeload.github.com/turborium/SSE2Sample/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252622717,"owners_count":21778064,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["assembly","delphi","demo","freepascal","graphics-programming","intrinsics","objectpascal","pascal","sample","sse2","x86-assembly"],"created_at":"2024-11-13T16:52:13.556Z","updated_at":"2025-05-06T04:32:48.228Z","avatar_url":"https://github.com/turborium.png","language":"Pascal","funding_links":[],"categories":[],"sub_categories":[],"readme":"# SSE2Sample\nExample of using SSE2\n\n![scr](scr.png)\n\nThe demo compares the frame rate of several versions of the fade code, from a simple loop to an SSE2 routine.  \nIt is also an example of outputting raw pixels with the WinAPI StretchDIBits function.  
\n\n[Precompiled Windows EXE (64 bit)](https://github.com/turborium/SSE2Sample/raw/main/SSE2Sample.exe)\n\nPowered by [Delphi](https://www.embarcadero.com/products/delphi) with [ObjectPascal](https://www.danysoft.com/estaticos/free/Libros%20en%20formato%20GRATUITO/ObjectPascalHandbook_AlexandriaVersion.pdf) [(RU)](https://lp.embarcadero.com/RU-ObjectPascalEbook). Get the free Delphi 11 Community Edition [here](https://www.embarcadero.com/products/delphi/starter).\n\n## Results\nMethod | x86 (32 bit)  | x64 (64 bit) |\n------ | ------------- | ------------- |\nSimple | 118 FPS  | 123 FPS  |\nLoopUnroll | 122 FPS  | 116 FPS  |\nLoopUnrollPtr | 105 FPS | 160 FPS  |\nSSE2 | 374 FPS  | 432 FPS |\n\nFor optimization purposes, the following directives are used:\n```Pascal\n{$POINTERMATH ON} // enable pointer arithmetic\n{$OVERFLOWCHECKS OFF} // disable integer overflow checks\n{$RANGECHECKS OFF} // disable range checks\n```\n\n**Simple** code:\n```Pascal\nprocedure FadeBufferSimple(Data: PByte; Count: Integer; Level: Byte);\nvar\n  I: Integer;\nbegin\n  // simple pixel fade\n  for I := 0 to Count - 1 do\n  begin\n    Data[I] := Max(0, Data[I] - Level);\n  end;\nend;\n```\n\n**LoopUnroll** code:\n```Pascal\nprocedure FadeBufferLoopUnroll(Data: PByte; Count: Integer; Level: Byte);\nvar\n  I, ChunkCount, Index: Integer;\nbegin\n  // count the full 16-byte chunks\n  ChunkCount := Count div 16;\n\n  // fade the chunks using \"loop unrolling\"\n  Index := 0;\n  for I := 0 to ChunkCount - 1 do\n  begin\n    Data[Index +  0] := Max(0, Data[Index +  0] - Level);\n    Data[Index +  1] := Max(0, Data[Index +  1] - Level);\n    Data[Index +  2] := Max(0, Data[Index +  2] - Level);\n    Data[Index +  3] := Max(0, Data[Index +  3] - Level);\n    Data[Index +  4] := Max(0, Data[Index +  4] - Level);\n    Data[Index +  5] := Max(0, Data[Index +  5] - Level);\n    Data[Index +  6] := Max(0, Data[Index +  6] - Level);\n    Data[Index +  7] := Max(0, 
Data[Index +  7] - Level);\n    Data[Index +  8] := Max(0, Data[Index +  8] - Level);\n    Data[Index +  9] := Max(0, Data[Index +  9] - Level);\n    Data[Index + 10] := Max(0, Data[Index + 10] - Level);\n    Data[Index + 11] := Max(0, Data[Index + 11] - Level);\n    Data[Index + 12] := Max(0, Data[Index + 12] - Level);\n    Data[Index + 13] := Max(0, Data[Index + 13] - Level);\n    Data[Index + 14] := Max(0, Data[Index + 14] - Level);\n    Data[Index + 15] := Max(0, Data[Index + 15] - Level);\n\n    Index := Index + 16;\n  end;\n\n  // fade the remaining tail bytes\n  for I := ChunkCount * 16 to Count - 1 do\n    Data[I] := Max(0, Data[I] - Level);\nend;\n```\n\n**LoopUnrollPtr** code:\n```Pascal\nprocedure FadeBufferLoopUnrollPtr(Data: PByte; Count: Integer; Level: Byte);\nvar\n  I, ChunkCount: Integer;\nbegin\n  // count the full 16-byte chunks\n  ChunkCount := Count div 16;\n\n  // fade the chunks using \"loop unrolling\" and pointers\n  for I := 0 to ChunkCount - 1 do\n  begin\n    Data^ := Max(0, Data^ - Level); Inc(Data);\n    Data^ := Max(0, Data^ - Level); Inc(Data);\n    Data^ := Max(0, Data^ - Level); Inc(Data);\n    Data^ := Max(0, Data^ - Level); Inc(Data);\n    Data^ := Max(0, Data^ - Level); Inc(Data);\n    Data^ := Max(0, Data^ - Level); Inc(Data);\n    Data^ := Max(0, Data^ - Level); Inc(Data);\n    Data^ := Max(0, Data^ - Level); Inc(Data);\n    Data^ := Max(0, Data^ - Level); Inc(Data);\n    Data^ := Max(0, Data^ - Level); Inc(Data);\n    Data^ := Max(0, Data^ - Level); Inc(Data);\n    Data^ := Max(0, Data^ - Level); Inc(Data);\n    Data^ := Max(0, Data^ - Level); Inc(Data);\n    Data^ := Max(0, Data^ - Level); Inc(Data);\n    Data^ := Max(0, Data^ - Level); Inc(Data);\n    Data^ := Max(0, Data^ - Level); Inc(Data);\n  end;\n\n  // fade the remaining tail bytes\n  for I := ChunkCount * 16 to Count - 1 do\n  begin\n    Data^ := Max(0, Data^ - Level); Inc(Data);\n  end;\nend;\n```\n\n**SSE2** code:\n```Pascal\nprocedure 
Fade16BytesSSE2(SourceVector, FadeVector: Pointer);\nasm\n  // Load the data from SourceVector and FadeVector into xmm0 and xmm1\n  movdqu xmm0, dqword ptr [SourceVector]\n  movdqu xmm1, dqword ptr [FadeVector]\n\n  // Saturating subtraction\n  psubusb xmm0, xmm1\n\n  // Store the data from xmm0 back to SourceVector\n  movdqu dqword ptr [SourceVector], xmm0\nend;\n\nprocedure FadeBufferSSE2(Data: PByte; Count: Integer; Level: Byte);\nvar\n  FadeVector: packed array [0..15] of Byte;\n  I, ChunkCount: Integer;\nbegin\n  // build a vector of 16 bytes holding the fade level\n  for I := 0 to 16 - 1 do\n    FadeVector[I] := Level;\n\n  // count the full 16-byte chunks\n  ChunkCount := Count div 16;\n\n  // fade the chunks using SSE2\n  for I := 0 to ChunkCount - 1 do\n    Fade16BytesSSE2(@Data[I * 16], @FadeVector[0]);\n\n  // fade the remaining tail bytes\n  for I := ChunkCount * 16 to Count - 1 do\n    Data[I] := Max(0, Data[I] - Level);\nend;\n```\n\n---\n\n## Effect variations from ChatGPT 4o:\n\n### First variation\n\u003e Here is updated code for the DrawEffect procedure, with more refined color transitions and smoother shapes to improve the visual effect.\n\u003e The changes include:\n\u003e - A reduced number of sprays (SprayCount) and of points in each spray (PointInSprayCount).\n\u003e - Smooth color transitions, using trigonometric functions for the base color.\n\u003e - Slight variations of the base color for each point, to avoid abrupt color transitions.\n```Pascal\nprocedure DrawEffect(Pixels: Pointer; Width, Height: Integer; var Time: Double; FadeMethod: TFadeMethod);\nconst\n  SprayCount = 50;  // Reduced the number of sprays for a smoother effect\n  PointInSprayCount = 15;  // Reduced the number of points per spray for more detail\n  SprayDeltaTime = 0.05;  // Increased the time step between sprays for smoothness\n  DeltaTime = 0.03;\n  FadeLevel = 2;\nvar\n  I, J: Integer;\n  X, Y: 
Double;\n  ScreenX, ScreenY: Integer;\n  T: Double;\n  BaseColor: TColor;\n  R, G, B: Byte;\nbegin\n  case FadeMethod of\n    TFadeMethod.Simple: FadeBufferSimple(Pixels, Width * Height * 4, FadeLevel);\n    TFadeMethod.LoopUnroll: FadeBufferLoopUnroll(Pixels, Width * Height * 4, FadeLevel);\n    TFadeMethod.LoopUnrollPtr: FadeBufferLoopUnrollPtr(Pixels, Width * Height * 4, FadeLevel);\n    TFadeMethod.SSE2: FadeBufferSSE2(Pixels, Width * Height * 4, FadeLevel);\n    else raise EAbstractError.Create('Bad FadeMethod');\n  end;\n\n  T := Time;\n  for I := 0 to SprayCount - 1 do\n  begin\n    X := 0.2 * (Cos(T) + Sin(T * 0.342 + 0.33) + Sin(T * 3.523)) * Width + 0.5 * Width;\n    Y := 0.2 * (Sin(T * 0.643) + Cos(T * 0.124 + 0.15) + Sin(T * 2.423)) * Height + 0.5 * Height;\n    ScreenX := Trunc(X);\n    ScreenY := Trunc(Y);\n\n    // Base color of the spray\n    BaseColor := RGB(128 + Round(127 * Sin(T)), 128 + Round(127 * Sin(T * 1.3)), 128 + Round(127 * Sin(T * 1.7)));\n\n    for J := 0 to PointInSprayCount - 1 do\n    begin\n      ScreenX := ScreenX + (Random(21 + J) - 10 - J div 2);\n      ScreenY := ScreenY + (Random(21 + J) - 10 - J div 2);\n      if (ScreenX \u003c 0) or (ScreenX \u003e= Width) or (ScreenY \u003c 0) or (ScreenY \u003e= Height) then\n        continue;\n\n      // Vary the shade of the base color\n      R := GetRValue(BaseColor) + Random(51) - 25;\n      G := GetGValue(BaseColor) + Random(51) - 25;\n      B := GetBValue(BaseColor) + Random(51) - 25;\n\n      PUInt32(Pixels)[ScreenX + ScreenY * Width] := RGB(R, G, B);\n    end;\n    T := T + SprayDeltaTime;\n  end;\n\n  Time := Time + DeltaTime;\nend;\n```\n\n### Second variation\n\u003e Here are a few more ideas for improving the visual effect:  \n\u003e   Adding gradient transitions:  \n\u003e   Introduce gradient transitions between colors to create a smoother effect.\n\u003e \n\u003e   Using more complex functions for coordinates:  \n\u003e   Apply more complex 
functions for computing point coordinates, to create interesting patterns and shapes.\n\u003e\n\u003e   Adding an alpha channel:  \n\u003e   Use an alpha channel to create a transparency effect.\n\u003e\n\u003e   Varying the point size:  \n\u003e   Vary the point sizes to create an effect of depth.\n```Pascal\nprocedure BlendPixel(Pixels: PByte; X, Y, Width: Integer; R, G, B, A: Byte);\nvar\n  Index: Integer;\n  DestR, DestG, DestB: Byte;\nbegin\n  if (X \u003c 0) or (X \u003e= Width) or (Y \u003c 0) then Exit;\n  Index := (X + Y * Width) * 4;\n\n  DestB := Pixels[Index];\n  DestG := Pixels[Index + 1];\n  DestR := Pixels[Index + 2];\n\n  Pixels[Index] := (B * A + DestB * (255 - A)) div 255;\n  Pixels[Index + 1] := (G * A + DestG * (255 - A)) div 255;\n  Pixels[Index + 2] := (R * A + DestR * (255 - A)) div 255;\nend;\n\nprocedure DrawEffect(Pixels: Pointer; Width, Height: Integer; var Time: Double; FadeMethod: TFadeMethod);\nconst\n  SprayCount = 50;\n  PointInSprayCount = 15;\n  SprayDeltaTime = 0.05;\n  DeltaTime = 0.03;\n  FadeLevel = 2;\nvar\n  I, J: Integer;\n  X, Y: Double;\n  ScreenX, ScreenY: Integer;\n  T: Double;\n  BaseColor: TColor;\n  R, G, B, A: Byte;\n  Size: Integer;\nbegin\n  case FadeMethod of\n    TFadeMethod.Simple: FadeBufferSimple(Pixels, Width * Height * 4, FadeLevel);\n    TFadeMethod.LoopUnroll: FadeBufferLoopUnroll(Pixels, Width * Height * 4, FadeLevel);\n    TFadeMethod.LoopUnrollPtr: FadeBufferLoopUnrollPtr(Pixels, Width * Height * 4, FadeLevel);\n    TFadeMethod.SSE2: FadeBufferSSE2(Pixels, Width * Height * 4, FadeLevel);\n    else raise EAbstractError.Create('Bad FadeMethod');\n  end;\n\n  T := Time;\n  for I := 0 to SprayCount - 1 do\n  begin\n    X := 0.2 * (Cos(T) + Sin(T * 0.342 + 0.33) + Sin(T * 3.523)) * Width + 0.5 * Width;\n    Y := 0.2 * (Sin(T * 0.643) + Cos(T * 0.124 + 0.15) + Sin(T * 2.423)) * Height + 0.5 * Height;\n    ScreenX := Trunc(X);\n    ScreenY := Trunc(Y);\n\n    // Base color of the spray, with a gradient\n    
BaseColor := RGB(128 + Round(127 * Sin(T)), 128 + Round(127 * Sin(T * 1.3)), 128 + Round(127 * Sin(T * 1.7)));\n\n    for J := 0 to PointInSprayCount - 1 do\n    begin\n      ScreenX := ScreenX + (Random(21 + J) - 10 - J div 2);\n      ScreenY := ScreenY + (Random(21 + J) - 10 - J div 2);\n      if (ScreenX \u003c 0) or (ScreenX \u003e= Width) or (ScreenY \u003c 0) or (ScreenY \u003e= Height) then\n        continue;\n\n      // Vary the shade of the base color, with a gradient\n      R := GetRValue(BaseColor) + Random(51) - 25;\n      G := GetGValue(BaseColor) + Random(51) - 25;\n      B := GetBValue(BaseColor) + Random(51) - 25;\n      A := 255 - Round(255 * (J / PointInSprayCount));  // Alpha channel\n\n      Size := Random(3) + 1;  // Vary the point size\n\n      // Draw the point, taking the alpha channel and size into account\n      for var DX := -Size to Size do\n        for var DY := -Size to Size do\n          if (ScreenX + DX \u003e= 0) and (ScreenX + DX \u003c Width) and (ScreenY + DY \u003e= 0) and (ScreenY + DY \u003c Height) then\n            BlendPixel(Pixels, ScreenX + DX, ScreenY + DY, Width, R, G, B, A);\n    end;\n    T := T + SprayDeltaTime;\n  end;\n\n  Time := Time + DeltaTime;\nend;\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fturborium%2Fsse2sample","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fturborium%2Fsse2sample","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fturborium%2Fsse2sample/lists"}