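Fix misjudged on-board performance tests and speed up framebuffer submission

On-board testing showed that the lightweight demo ran at only ten-odd FPS with an unoptimized ARM build, but reached about 76 FPS once built as Release. The problem came mainly from two sources: single-config CMake builds did not default to Release, and the fb0 submission path paid too much for generic per-pixel conversion.

This change sets the default build type for single-config generators to Release, so ARM/fb0 performance tests cannot accidentally use an unoptimized build. It also adds fast paths to FBDisplay for common formats: RGB565, ARGB8888/XRGB8888, and RGBA8888 with contiguous memory. The demo temporarily drops the rotating cube and keeps the 2D sprite, tilemap, FPS, and Frame/Present timing display, to pinpoint the on-board framebuffer submission bottleneck.

Constraint: IMX6U framebuffer performance conclusions must be based on Release builds
Constraint: Core code stays C++11 compatible and adds no new dependencies
Rejected: Concluding outright that the board is underpowered | An unoptimized build severely inflates per-pixel loop cost
Rejected: Continuing to optimize the 3D cube path | Frame/Present timing has already shown that the current bottleneck is mainly the fb0 present
Confidence: high
Scope-risk: moderate
Directive: For future on-board performance tests, confirm CMAKE_BUILD_TYPE=Release first, then analyze algorithmic bottlenecks
Directive: In 2D-only scenes, do not use DrawContext::clear(), which clears the depth buffer
Tested: cmake --build build-win --config Release
Tested: wsl bash -lc "cd /mnt/d/source/IMX6U-Game && cmake --build build-arm-fb"
Not-tested: Uncommon fb0 layouts other than the pixel format of the actual board's framebuffer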
parent a9bc9a59fb
commit d92b890528
|
|
@ -57,7 +57,11 @@ AGENTS.md
|
||||||
|
|
||||||
build-win
|
build-win
|
||||||
build-linux
|
build-linux
|
||||||
|
build-arm-fb
|
||||||
|
build-arm-sdl
|
||||||
|
|
||||||
.idea
|
.idea
|
||||||
|
|
||||||
assets/test
|
assets/test
|
||||||
|
|
||||||
|
gcc-linaro-4.9.4-2017.01-x86_64_arm-linux-gnueabihf.tar.xz
|
||||||
|
|
@ -4,6 +4,10 @@ project(IMX6U-Game)
|
||||||
set(CMAKE_CXX_STANDARD 11)
|
set(CMAKE_CXX_STANDARD 11)
|
||||||
set(CMAKE_CXX_STANDARD_REQUIRED ON)
|
set(CMAKE_CXX_STANDARD_REQUIRED ON)
|
||||||
|
|
||||||
|
if(NOT CMAKE_CONFIGURATION_TYPES AND NOT CMAKE_BUILD_TYPE)
|
||||||
|
set(CMAKE_BUILD_TYPE Release CACHE STRING "Build type" FORCE)
|
||||||
|
endif()
|
||||||
|
|
||||||
option(USE_FRAMEBUFFER "Use Linux framebuffer instead of SDL2" OFF)
|
option(USE_FRAMEBUFFER "Use Linux framebuffer instead of SDL2" OFF)
|
||||||
|
|
||||||
set(SOURCES
|
set(SOURCES
|
||||||
|
|
|
||||||
21
README.md
|
|
@ -115,6 +115,8 @@ cmake -B build-arm-fb \
|
||||||
cmake --build build-arm-fb
|
cmake --build build-arm-fb
|
||||||
```
|
```
|
||||||
|
|
||||||
|
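Note: single-config generators (Makefile/Ninja) default to a `Release` build. ARM / framebuffer performance tests must confirm `CMAKE_BUILD_TYPE=Release`; otherwise per-pixel drawing and `/dev/fb0` submission can be off by an order of magnitude because of the unoptimized build.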
|
||||||
|
|
||||||
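Build (SDL2 backend; the toolchain/sysroot must be able to find the target board's SDL2 development libraries):
|
Build (SDL2 backend; the toolchain/sysroot must be able to find the target board's SDL2 development libraries):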
|
||||||
```bash
|
```bash
|
||||||
cmake -B build-arm-sdl \
|
cmake -B build-arm-sdl \
|
||||||
|
|
@ -277,6 +279,21 @@ IMX6U-Game/
|
||||||
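- **FBDisplay**: `/dev/fb0` comparison backend, used to validate a minimal display path
|
- **FBDisplay**: `/dev/fb0` comparison backend, used to validate a minimal display path
|
||||||
- **ITimeSource / SteadyTimeSource**: standalone time-source interface and monotonic clock implementation; Linux/IMX6U uses `clock_gettime(CLOCK_MONOTONIC)`, Windows uses `std::chrono::steady_clock`, and Display no longer handles timing
|
- **ITimeSource / SteadyTimeSource**: standalone time-source interface and monotonic clock implementation; Linux/IMX6U uses `clock_gettime(CLOCK_MONOTONIC)`, Windows uses `std::chrono::steady_clock`, and Display no longer handles timing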
|
||||||
|
|
||||||
|
### Framebuffer performance notes
|
||||||
|
|
||||||
|
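`FBDisplay` is a comparison backend that writes directly to `/dev/fb0`. The current implementation submits the CPU-side `FrameBuffer` to the system framebuffer and provides fast paths for common pixel formats: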
|
||||||
|
|
||||||
|
- RGB565: dedicated RGBA -> RGB565 conversion;
|
||||||
|
- ARGB8888 / XRGB8888-style 32bpp: dedicated channel reordering;
|
||||||
|
- RGBA8888 with contiguous rows: a single block `memcpy`.
|
||||||
|
|
||||||
|
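On-board performance tests must use a Release build. In one test, an unoptimized ARM build produced `Frame:81ms / Present:69ms`; with Release enabled, the same lightweight 2D demo reached about 76 FPS. This shows that full-screen submission to `/dev/fb0` is still a key hotspot, but the build type has a very large effect on the conclusion. Priorities for further optimization: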
|
||||||
|
|
||||||
|
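1. Use a backbuffer whose pixel format matches the target fb0 directly, for example RGB565, to reduce conversion at submission time;
|
||||||
|
2. Use dirty rects / partial submission in 2D scenes to avoid writing the full screen every frame;
|
||||||
|
3. Avoid clearing the depth buffer when there is no 3D content;
|
||||||
|
4. Add opaque row copies, pre-converted assets, or dedicated batching paths for tiles/sprites.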
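
As an illustration of items 1 and 2 combined, here is a minimal sketch of a partial submission from an RGB565 backbuffer. It is not code from this commit: `DirtyRect`, `clip_rect`, and `present_dirty_rgb565` are hypothetical names, and the sketch assumes the backbuffer and fb0 are both RGB565 and that the destination is at least as wide as the backbuffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical types and names for illustration; not the project's API.
struct DirtyRect
{
    int x, y, w, h;
};

// Clamp a rectangle to [0, width) x [0, height); w or h may become 0.
inline DirtyRect clip_rect(DirtyRect r, int width, int height)
{
    const int x0 = std::max(r.x, 0);
    const int y0 = std::max(r.y, 0);
    const int x1 = std::min(r.x + r.w, width);
    const int y1 = std::min(r.y + r.h, height);
    DirtyRect out = {x0, y0, std::max(x1 - x0, 0), std::max(y1 - y0, 0)};
    return out;
}

// Copy only the dirty rows and columns from an RGB565 backbuffer to the
// destination. dst_stride is in bytes (finfo.line_length on a real fb0).
inline void present_dirty_rgb565(const uint16_t* src, int width, int height,
                                 uint8_t* dst, std::size_t dst_stride, DirtyRect dirty)
{
    assert(src != nullptr && dst != nullptr);
    const DirtyRect r = clip_rect(dirty, width, height);
    if (r.w == 0 || r.h == 0)
    {
        return;
    }
    const std::size_t row_bytes = static_cast<std::size_t>(r.w) * sizeof(uint16_t);
    for (int y = r.y; y < r.y + r.h; ++y)
    {
        const uint16_t* src_row = src + static_cast<std::size_t>(y) * width + r.x;
        uint8_t* dst_row = dst + static_cast<std::size_t>(y) * dst_stride
                               + static_cast<std::size_t>(r.x) * sizeof(uint16_t);
        std::memcpy(dst_row, src_row, row_bytes);
    }
}
```

Because the formats match, each dirty row is a plain `memcpy` with no per-pixel conversion; the cost scales with the dirty area rather than the full screen.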
|
||||||
|
|
||||||
## Current status and next steps
|
## Current status and next steps
|
||||||
|
|
||||||||
**Done:**
|
**Done:**
|
||||||
|
|
@ -284,13 +301,15 @@ IMX6U-Game/
|
||||||
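- Dual-platform display backends (SDL2 / Framebuffer)
|
- Dual-platform display backends (SDL2 / Framebuffer)
|
||||||
- Offline asset conversion tools: PNG sprite -> C++ header, pixel font -> bitmap atlas/header
|
- Offline asset conversion tools: PNG sprite -> C++ header, pixel font -> bitmap atlas/header
|
||||||
- Basic 2D sprite, SpriteRegion, bitmap font text drawing, and tilemap viewport drawing; the current demo shows FPS text, a test sprite, and a small scrolling tilemap
|
- Basic 2D sprite, SpriteRegion, bitmap font text drawing, and tilemap viewport drawing; the current demo shows FPS text, a test sprite, and a small scrolling tilemap
|
||||||
|
- The current on-board performance demo has temporarily dropped the rotating cube and keeps only the 2D sprite/tilemap/FPS and `Frame/Present` timing display, to measure the framebuffer submission bottleneck
|
||||||
- Gfx directory normalized; code consolidated under `src/Gfx/`
|
- Gfx directory normalized; code consolidated under `src/Gfx/`
|
||||||
- `Gfx::DrawContext` as the unified drawing entry point, wrapping the existing drawing capabilities
|
- `Gfx::DrawContext` as the unified drawing entry point, wrapping the existing drawing capabilities
|
||||||
|
- `DrawContext::clear_color()` clears only the color buffer, so a 2D-only demo does not pointlessly clear the depth buffer every frame
|
||||||
- C++11-compatible code
|
- C++11-compatible code
|
||||||
- Cross-platform CMake build
|
- Cross-platform CMake build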
|
||||||
|
|
||||||
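**To do (by priority):**
|
**To do (by priority):**
|
||||||
1. FrameBuffer performance optimization (`memset` screen clear, removing `at()`, fixed-point/NEON)
|
1. FrameBuffer / FBDisplay performance optimization (backbuffer in the target pixel format, dirty rects, dedicated tile/sprite fast paths, NEON)
|
||||||
2. Application-layer split (Launcher / GameA / GameB / Shared) and a unified `IApp` main loop
|
2. Application-layer split (Launcher / GameA / GameB / Shared) and a unified `IApp` main loop
|
||||||
3. SDL2 input abstraction (keyboard/touch/button state snapshots)
|
3. SDL2 input abstraction (keyboard/touch/button state snapshots)
|
||||||
4. Gfx basic 2D drawing interface (rectangles, quads, and continued work on batching and dedicated fast paths for Sprite/Text/Tilemap)
|
4. Gfx basic 2D drawing interface (rectangles, quads, and continued work on batching and dedicated fast paths for Sprite/Text/Tilemap)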
|
||||||
|
|
|
||||||
|
|
@ -20,7 +20,7 @@ IMX6U-Game/
|
||||||
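│ │ ├─ Platform/ # SDL2 / fb0 display adapters and standalone time source (✅ implemented)
|
│ │ ├─ Platform/ # SDL2 / fb0 display adapters and standalone time source (✅ implemented)
|
||||||
│ │ └─ Asset/ # Asset loading (✅ implemented)
|
│ │ └─ Asset/ # Asset loading (✅ implemented)
|
||||||
│ ├─ Apps/
|
│ ├─ Apps/
|
||||||
│ │ ├─ Demo/ # Current 3D cube demo (✅ implemented)
|
│ │ ├─ Demo/ # Current on-board performance demo; historically also used for 3D cube validation (✅ implemented)
|
||||||
│ │ ├─ Launcher/ # Launcher app (to be implemented)
|
│ │ ├─ Launcher/ # Launcher app (to be implemented)
|
||||||
│ │ ├─ GameA/ # First game (to be implemented)
|
│ │ ├─ GameA/ # First game (to be implemented)
|
||||||
│ │ └─ GameB/ # Second game (to be implemented)
|
│ │ └─ GameB/ # Second game (to be implemented)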
|
||||||
|
|
@ -226,7 +226,7 @@ namespace Gfx
|
||||||
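4. ~~Move the low-level code to `src/Gfx/` and the Demo entry point to `src/Apps/Demo/`.~~ **Done**
|
4. ~~Move the low-level code to `src/Gfx/` and the Demo entry point to `src/Apps/Demo/`.~~ **Done**
|
||||||
5. Add a Launcher app that provides only a minimal menu and app switching.
|
5. Add a Launcher app that provides only a minimal menu and app switching.
|
||||||
6. Add empty GameA/GameB shells to verify switching among the three apps.
|
6. Add empty GameA/GameB shells to verify switching among the three apps.
|
||||||
7. Then gradually move the existing 3D demo or 2D game logic into the corresponding Game directories.
|
7. Then gradually restore the existing 3D demo capability as a standalone validation entry point, or move the 2D game logic into the corresponding Game directories.
|
||||||
8. Finally, restructure CMake, splitting it into `imx6u_gfx` plus per-application targets.
|
8. Finally, restructure CMake, splitting it into `imx6u_gfx` plus per-application targets.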
|
||||||
|
|
||||||
## 8. Performance notes
|
## 8. Performance notes
|
||||||
|
|
@ -237,6 +237,9 @@ namespace Gfx
|
||||||
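- The Launcher should not keep large amounts of texture/audio resources resident; non-essential launcher resources can be released once a game starts.
|
- The Launcher should not keep large amounts of texture/audio resources resident; non-essential launcher resources can be released once a game starts.
|
||||||
- Gfx drawing functions should stay small and direct, favoring inlining and contiguous memory writes.
|
- Gfx drawing functions should stay small and direct, favoring inlining and contiguous memory writes.
|
||||||
- The UI widget layer may be object-oriented; the pixel/quad/sprite/tile drawing layer should not be over-abstracted.
|
- The UI widget layer may be object-oriented; the pixel/quad/sprite/tile drawing layer should not be over-abstracted.
|
||||||
|
- 2D apps without 3D content should use the color-only clear path to avoid clearing the depth buffer every frame.
|
||||||
|
- Submission in the `/dev/fb0` backend may be the main bottleneck; on-board performance analysis should separate `Frame` and `Present` times and confirm that the ARM build is Release.
|
||||||
|
- Writing directly to the framebuffer is not an atomic buffer swap; LCD scan-out can make local animation look smoother than the full-frame rate suggests, so judge performance by timing data.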
|
||||||
|
|
||||||
## 9. Asset conversion conventions
|
## 9. Asset conversion conventions
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -87,11 +87,31 @@
|
||||||
## 5. Rendering pipeline performance rules
|
## 5. Rendering pipeline performance rules
|
||||||
|
|
||||||
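- FrameBuffer / DepthBuffer clears must prefer bulk fills (such as `memset`, `std::fill`, or a platform-optimized path) over complex per-pixel logic.
|
- FrameBuffer / DepthBuffer clears must prefer bulk fills (such as `memset`, `std::fill`, or a platform-optimized path) over complex per-pixel logic.
|
||||||
|
- Scenes that are 2D-only or do not use depth testing should clear only the color buffer, for example with `DrawContext::clear_color()`; do not pointlessly clear the depth buffer every frame.
|
||||||
- Pixel write paths should minimize branches and function-call depth.
|
- Pixel write paths should minimize branches and function-call depth.
|
||||||
|
- Release fast paths for per-pixel writes must not use container access with extra bounds checking (for example `std::vector::at()`); bounds checks belong in the outer layer.
|
||||||
- Clipping, culling, and bounding-box shrinking should run as early as possible so that invisible data never reaches per-pixel loops.
|
- Clipping, culling, and bounding-box shrinking should run as early as possible so that invisible data never reaches per-pixel loops.
|
||||||
- Future features such as triangle attribute interpolation, depth testing, and texture sampling must define a fixed-point/integer scheme before being wired into hot paths.
|
- Future features such as triangle attribute interpolation, depth testing, and texture sampling must define a fixed-point/integer scheme before being wired into hot paths.
|
||||||
- PC debug builds may keep more readable checks and visualization code, but the ARM release path must be able to turn these extra costs off.
|
- PC debug builds may keep more readable checks and visualization code, but the ARM release path must be able to turn these extra costs off.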
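
The rule about moving bounds checks to the outer layer can be sketched as follows. This is an illustrative assumption, not the project's `FrameBuffer` code: `fill_span` is a hypothetical helper that clips once per span and then writes through a raw pointer with a bulk fill.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: fill pixels [x0, x1) on row y with one color.
// Returns the number of pixels written.
inline int fill_span(std::vector<uint32_t>& buffer, int width, int height,
                     int y, int x0, int x1, uint32_t color)
{
    assert(buffer.size() == static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    // Outer layer: reject and clip once per span.
    if (y < 0 || y >= height)
    {
        return 0;
    }
    x0 = std::max(x0, 0);
    x1 = std::min(x1, width);
    if (x0 >= x1)
    {
        return 0;
    }
    // Inner loop: unchecked contiguous writes, no per-pixel at() call.
    uint32_t* row = buffer.data() + static_cast<std::size_t>(y) * width;
    std::fill(row + x0, row + x1, color);
    return x1 - x0;
}
```

The per-pixel work contains no branches or checked accesses; all validation happens before the fill starts.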
|
||||||
|
|
||||||
|
### 5.1 `/dev/fb0` submission path
|
||||||
|
|
||||||
|
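The framebuffer backend is the path on IMX6U whose performance is easiest to misjudge. `FBDisplay::present()` has to write the CPU-side framebuffer to `/dev/fb0`; if every pixel goes through generic format conversion, a full-screen 1024x600 submission becomes the main bottleneck.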
|
||||||
|
|
||||||
|
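Rules:
|
||||||
|
|
||||||||
|
- ARM / framebuffer performance tests must use a Release build; with single-config generators, confirm `CMAKE_BUILD_TYPE=Release`.
|
||||||
|
- Performance conclusions must separate `Frame` and `Present` times; if `Present` is close to `Frame`, optimize display submission before game logic.
|
||||||
|
- The fb0 pixel format should be printed at initialization and used to select a dedicated path; common formats include RGB565, ARGB8888/XRGB8888, and RGBA8888.
|
||||||
|
- Avoid repeating unnecessary generic RGBA conversion every frame; consider a target-format backbuffer, pre-converted assets, row copies, dirty rects, and partial submission.
|
||||||
|
- Writing directly to `/dev/fb0` is not an atomic buffer swap; the LCD controller may scan out memory while it is being written, so perceived smoothness is not the same as the full-frame rate.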
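
One way to follow the "print the format at init and select a dedicated path" rule is to classify the layout once and cache the result. The sketch below is an assumption for illustration: `FbFormatInfo` and `ChannelLayout` are plain stand-ins for the relevant fields of `fb_var_screeninfo` / `fb_bitfield` so that it compiles off-target, and the offset/length values are the same ones the fast-path checks in `FBDisplay::present()` use.

```cpp
#include <cstdint>

// Stand-ins for fb_bitfield / fb_var_screeninfo fields from <linux/fb.h>.
// All names in this sketch are hypothetical.
struct ChannelLayout
{
    uint32_t offset;
    uint32_t length;
};

struct FbFormatInfo
{
    uint32_t bits_per_pixel;
    ChannelLayout red, green, blue;
};

enum class FbFormat { RGB565, ARGB8888, RGBA8888, Other };

inline bool channel_is(const ChannelLayout& c, uint32_t offset, uint32_t length)
{
    return c.offset == offset && c.length == length;
}

// Classify once at init; present() can then switch on the cached result.
inline FbFormat classify_fb_format(const FbFormatInfo& v)
{
    if (v.bits_per_pixel == 16 &&
        channel_is(v.red, 11, 5) && channel_is(v.green, 5, 6) && channel_is(v.blue, 0, 5))
    {
        return FbFormat::RGB565;
    }
    if (v.bits_per_pixel == 32 &&
        channel_is(v.red, 16, 8) && channel_is(v.green, 8, 8) && channel_is(v.blue, 0, 8))
    {
        return FbFormat::ARGB8888; // also covers XRGB8888
    }
    if (v.bits_per_pixel == 32 &&
        channel_is(v.red, 24, 8) && channel_is(v.green, 16, 8) && channel_is(v.blue, 8, 8))
    {
        return FbFormat::RGBA8888;
    }
    return FbFormat::Other;
}

// Human-readable name for the init-time log line.
inline const char* fb_format_name(FbFormat f)
{
    switch (f)
    {
    case FbFormat::RGB565:   return "RGB565";
    case FbFormat::ARGB8888: return "ARGB8888/XRGB8888";
    case FbFormat::RGBA8888: return "RGBA8888";
    default:                 return "other (generic path)";
    }
}
```

Classifying at init keeps the three format comparisons out of every `present()` call and gives a single place to log which path the board actually takes.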
|
||||||
|
|
||||||
|
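Observed on-board test results:
|
||||||
|
|
||||||||
|
- With an unoptimized ARM build, the lightweight 2D demo measured about `Frame:81ms / Present:69ms`.
|
||||||
|
- After switching to a Release build, the same kind of framebuffer test peaked at about 76 FPS.
|
||||||
|
- Therefore, on-board performance tests should check the build type first, and only then look for algorithmic or hardware bottlenecks.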
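
To put these numbers in context, the arithmetic below estimates how much data a full-screen 1024x600 present moves. It is a back-of-the-envelope sketch, not measured data: the helper names are hypothetical, and the document does not state which pixel format the test board uses, so both 16bpp and 32bpp are shown.

```cpp
#include <cstdint>

// Bytes written to fb0 for one full-screen present.
constexpr uint64_t frame_bytes(uint32_t width, uint32_t height, uint32_t bytes_per_pixel)
{
    return static_cast<uint64_t>(width) * height * bytes_per_pixel;
}

// Sustained write rate needed for full-screen presents at a given FPS.
constexpr uint64_t bytes_per_second(uint64_t bytes_per_frame, uint32_t fps)
{
    return bytes_per_frame * fps;
}

// Integer FPS implied by a frame time in milliseconds (rounded down).
constexpr uint32_t fps_from_frame_ms(uint32_t frame_ms)
{
    return frame_ms == 0 ? 0 : 1000u / frame_ms;
}

static_assert(frame_bytes(1024, 600, 2) == 1228800u, "RGB565 frame size");
static_assert(frame_bytes(1024, 600, 4) == 2457600u, "32bpp frame size");
static_assert(bytes_per_second(frame_bytes(1024, 600, 2), 76) == 93388800u, "RGB565 at 76 FPS");
static_assert(fps_from_frame_ms(81) == 12, "81 ms per frame");
```

An 81 ms frame corresponds to roughly 12 FPS, and 69 ms of it (about 85%) is spent in `Present`. At 76 FPS, full-screen writes amount to about 93 MB/s for RGB565 or about 187 MB/s for 32bpp, which is why reducing per-pixel work in the submission path matters so much.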
|
||||||
|
|
||||||
## 6. Boundaries for STL and standard library use
|
## 6. Boundaries for STL and standard library use
|
||||||
|
|
||||||||
Allowed:
|
Allowed:
|
||||||
|
|
@ -129,6 +149,7 @@
|
||||||
|
|
||||||
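Before adding or modifying core code, check at least the following:
|
Before adding or modifying core code, check at least the following:
|
||||||
|
|
||||||||
|
- [ ] Has the ARM / framebuffer performance test been confirmed to use a Release build?
|
||||||
- [ ] Did you add `float` / `double` to a hot path? If so, can it be changed to integer/fixed-point?
|
- [ ] Did you add `float` / `double` to a hot path? If so, can it be changed to integer/fixed-point?
|
||||||
- [ ] Are `std::vector` / `std::string` / other heap-allocated objects created every frame or in an inner loop?
|
- [ ] Are `std::vector` / `std::string` / other heap-allocated objects created every frame or in an inner loop?
|
||||||
- [ ] Do containers `reserve()` up front, or are they reused by the caller?
|
- [ ] Do containers `reserve()` up front, or are they reused by the caller?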
|
||||||
|
|
@ -214,3 +235,11 @@
|
||||||
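- Total frame time, minimum FPS, peak frame time
|
- Total frame time, minimum FPS, peak frame time
|
||||||
|
|
||||||||
Performance logs are emitted at low frequency by default, for example one summary every 60 frames; ARM release builds must not print heavily every frame.
|
Performance logs are emitted at low frequency by default, for example one summary every 60 frames; ARM release builds must not print heavily every frame.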
|
||||||
|
|
||||||
|
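The framebuffer comparison backend must likewise display or record at least:
|
||||||
|
|
||||||||
|
- `Frame`: total time from the start of the frame to the end of submission;
|
||||||
|
- `Present`: time spent in `IDisplay::present()`;
|
||||||
|
- Current FPS.
|
||||||
|
|
||||||||
|
The current Demo's on-board test UI already shows this information; a proper profiler can replace this temporary display later.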
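
A later profiler could combine these per-frame values with the low-frequency logging rule above. The sketch below is one possible shape, under the assumption that the caller already measures `Frame` and `Present` in milliseconds each frame; `PerfWindow`, `PerfSummary`, and `perf_add_frame` are hypothetical names, not existing project types.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical accumulator: collects per-frame times and yields one
// summary per window, so logging stays off the per-frame path.
struct PerfWindow
{
    uint32_t window_frames;
    uint32_t frames;
    uint32_t frame_ms_sum;
    uint32_t present_ms_sum;
    uint32_t frame_ms_peak;

    explicit PerfWindow(uint32_t n)
        : window_frames(n), frames(0), frame_ms_sum(0), present_ms_sum(0), frame_ms_peak(0)
    {
        assert(n > 0);
    }
};

struct PerfSummary
{
    uint32_t avg_frame_ms;
    uint32_t avg_present_ms;
    uint32_t peak_frame_ms;
};

// Add one frame. Returns true and fills `out` when a window completes.
inline bool perf_add_frame(PerfWindow& w, uint32_t frame_ms, uint32_t present_ms, PerfSummary& out)
{
    ++w.frames;
    w.frame_ms_sum += frame_ms;
    w.present_ms_sum += present_ms;
    if (frame_ms > w.frame_ms_peak)
    {
        w.frame_ms_peak = frame_ms;
    }
    if (w.frames < w.window_frames)
    {
        return false;
    }
    out.avg_frame_ms = w.frame_ms_sum / w.frames;
    out.avg_present_ms = w.present_ms_sum / w.frames;
    out.peak_frame_ms = w.frame_ms_peak;
    // Reset for the next window.
    w.frames = 0;
    w.frame_ms_sum = 0;
    w.present_ms_sum = 0;
    w.frame_ms_peak = 0;
    return true;
}
```

With `PerfWindow window(60);`, the main loop would call `perf_add_frame` once per frame and print only when it returns true, which matches the "one summary every 60 frames" guideline.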
|
||||||
|
|
|
||||||
|
|
@ -5,17 +5,9 @@
|
||||||
#include <cstring>
|
#include <cstring>
|
||||||
#include <thread>
|
#include <thread>
|
||||||
#include <chrono>
|
#include <chrono>
|
||||||
#include "Vector2.h"
|
|
||||||
#include "Vector3.h"
|
|
||||||
#include "Vector4.h"
|
|
||||||
#include "Matrix4x4.h"
|
|
||||||
#include "MathUtil.h"
|
|
||||||
#include "Color.h"
|
#include "Color.h"
|
||||||
#include "Triangle.h"
|
|
||||||
#include "Camera.h"
|
|
||||||
#include <cstdlib>
|
#include <cstdlib>
|
||||||
#include "Timer.h"
|
#include "Timer.h"
|
||||||
#include "Vertex.h"
|
|
||||||
#include "DrawContext.h"
|
#include "DrawContext.h"
|
||||||
#include "test_sprite.h"
|
#include "test_sprite.h"
|
||||||
#include "font_atlas.h"
|
#include "font_atlas.h"
|
||||||
|
|
@ -28,86 +20,9 @@
|
||||||
#include "SDLDisplay.h"
|
#include "SDLDisplay.h"
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
const int32_t width = 800;
|
const int32_t width = 1024;
|
||||||
const int32_t height = 600;
|
const int32_t height = 600;
|
||||||
|
|
||||||
struct ProjectedVertex
|
|
||||||
{
|
|
||||||
Math::Vector3 screen;
|
|
||||||
bool visible = false;
|
|
||||||
};
|
|
||||||
|
|
||||||
struct CubeFace
|
|
||||||
{
|
|
||||||
std::array<int, 4> vertices;
|
|
||||||
};
|
|
||||||
|
|
||||||
struct CubeTriangle
|
|
||||||
{
|
|
||||||
std::array<int, 3> vertices;
|
|
||||||
};
|
|
||||||
|
|
||||||
static ProjectedVertex ProjectToScreen(
|
|
||||||
const Math::Vector3 &vertex,
|
|
||||||
const Math::Matrix4x4 &mvp,
|
|
||||||
const Math::Matrix4x4 &viewport)
|
|
||||||
{
|
|
||||||
using namespace Math;
|
|
||||||
|
|
||||||
const Vector4 clip = mvp * Vector4::Point(vertex);
|
|
||||||
if (std::abs(clip.w) < 1e-5f)
|
|
||||||
{
|
|
||||||
return {};
|
|
||||||
}
|
|
||||||
|
|
||||||
const float invW = 1.0f / clip.w;
|
|
||||||
const float ndcX = clip.x * invW;
|
|
||||||
const float ndcY = clip.y * invW;
|
|
||||||
const float ndcZ = clip.z * invW;
|
|
||||||
|
|
||||||
if (ndcX < -1.0f || ndcX > 1.0f || ndcY < -1.0f || ndcY > 1.0f || ndcZ < -1.0f || ndcZ > 1.0f)
|
|
||||||
{
|
|
||||||
return {};
|
|
||||||
}
|
|
||||||
|
|
||||||
const Vector4 screen = viewport * Vector4(ndcX, ndcY, ndcZ, 1.0f);
|
|
||||||
ProjectedVertex result;
|
|
||||||
result.screen = Vector3(screen.x, screen.y, screen.z);
|
|
||||||
result.visible = true;
|
|
||||||
return result;
|
|
||||||
}
|
|
||||||
|
|
||||||
static bool IsFaceVisible(const CubeFace &face, const std::array<Math::Vector3, 8> &viewSpaceVertices)
|
|
||||||
{
|
|
||||||
using namespace Math;
|
|
||||||
|
|
||||||
const Vector3 &v0 = viewSpaceVertices[face.vertices[0]];
|
|
||||||
const Vector3 &v1 = viewSpaceVertices[face.vertices[1]];
|
|
||||||
const Vector3 &v2 = viewSpaceVertices[face.vertices[2]];
|
|
||||||
const Vector3 faceNormal = (v1 - v0).cross(v2 - v0);
|
|
||||||
const Vector3 faceCenter =
|
|
||||||
(viewSpaceVertices[face.vertices[0]] +
|
|
||||||
viewSpaceVertices[face.vertices[1]] +
|
|
||||||
viewSpaceVertices[face.vertices[2]] +
|
|
||||||
viewSpaceVertices[face.vertices[3]]) /
|
|
||||||
4.0f;
|
|
||||||
|
|
||||||
return faceNormal.dot(faceCenter) > 0.0f;
|
|
||||||
}
|
|
||||||
|
|
||||||
static bool IsTriangleVisible(const CubeTriangle &triangle, const std::array<Math::Vector3, 8> &viewSpaceVertices)
|
|
||||||
{
|
|
||||||
using namespace Math;
|
|
||||||
|
|
||||||
const Vector3 &v0 = viewSpaceVertices[triangle.vertices[0]];
|
|
||||||
const Vector3 &v1 = viewSpaceVertices[triangle.vertices[1]];
|
|
||||||
const Vector3 &v2 = viewSpaceVertices[triangle.vertices[2]];
|
|
||||||
const Vector3 faceNormal = (v1 - v0).cross(v2 - v0);
|
|
||||||
const Vector3 faceCenter = (v0 + v1 + v2) / 3.0f;
|
|
||||||
|
|
||||||
return faceNormal.dot(faceCenter) > 0.0f;
|
|
||||||
}
|
|
||||||
|
|
||||||
static void PrintUsage(const char *program_name)
|
static void PrintUsage(const char *program_name)
|
||||||
{
|
{
|
||||||
std::cout
|
std::cout
|
||||||
|
|
@ -227,142 +142,33 @@ int main(int argc, char *argv[])
|
||||||
RenderData::Tilemap::EmptyTile, 0, RenderData::Tilemap::EmptyTile, 0, RenderData::Tilemap::EmptyTile, 0, RenderData::Tilemap::EmptyTile, 0};
|
RenderData::Tilemap::EmptyTile, 0, RenderData::Tilemap::EmptyTile, 0, RenderData::Tilemap::EmptyTile, 0, RenderData::Tilemap::EmptyTile, 0};
|
||||||
RenderData::Tilemap testTilemap(tileIds.data(), 8, 4, &sprite_img, sprite_img.width, sprite_img.height, 1);
|
RenderData::Tilemap testTilemap(tileIds.data(), 8, 4, &sprite_img, sprite_img.width, sprite_img.height, 1);
|
||||||
|
|
||||||
Scene::Camera camera;
|
|
||||||
camera.transform.position = Math::Vector3(0.0f, 0.0f, 3.0f);
|
|
||||||
camera.transform.rotation = Math::Vector3(0.0f, 3.1415926535f, 0.0f);
|
|
||||||
|
|
||||||
const std::array<Math::Vector3, 8> cubeVertices = {
|
|
||||||
Math::Vector3(-0.5f, -0.5f, -0.5f),
|
|
||||||
Math::Vector3(0.5f, -0.5f, -0.5f),
|
|
||||||
Math::Vector3(0.5f, 0.5f, -0.5f),
|
|
||||||
Math::Vector3(-0.5f, 0.5f, -0.5f),
|
|
||||||
Math::Vector3(-0.5f, -0.5f, 0.5f),
|
|
||||||
Math::Vector3(0.5f, -0.5f, 0.5f),
|
|
||||||
Math::Vector3(0.5f, 0.5f, 0.5f),
|
|
||||||
Math::Vector3(-0.5f, 0.5f, 0.5f)};
|
|
||||||
|
|
||||||
const std::array<CubeFace, 6> cubeFaces = {
|
|
||||||
CubeFace{{0, 3, 2, 1}},
|
|
||||||
CubeFace{{4, 5, 6, 7}},
|
|
||||||
CubeFace{{0, 4, 7, 3}},
|
|
||||||
CubeFace{{1, 2, 6, 5}},
|
|
||||||
CubeFace{{0, 1, 5, 4}},
|
|
||||||
CubeFace{{3, 7, 6, 2}}};
|
|
||||||
const std::array<CubeTriangle, 12> cubeTriangles = {
|
|
||||||
CubeTriangle{{0, 3, 2}}, CubeTriangle{{0, 2, 1}},
|
|
||||||
CubeTriangle{{4, 5, 6}}, CubeTriangle{{4, 6, 7}},
|
|
||||||
CubeTriangle{{0, 4, 7}}, CubeTriangle{{0, 7, 3}},
|
|
||||||
CubeTriangle{{1, 2, 6}}, CubeTriangle{{1, 6, 5}},
|
|
||||||
CubeTriangle{{0, 1, 5}}, CubeTriangle{{0, 5, 4}},
|
|
||||||
CubeTriangle{{3, 7, 6}}, CubeTriangle{{3, 6, 2}}};
|
|
||||||
|
|
||||||
const RenderData::Color clearColor(18, 18, 24, 255);
|
const RenderData::Color clearColor(18, 18, 24, 255);
|
||||||
const RenderData::Color cubeColor(240, 240, 240, 255);
|
|
||||||
const RenderData::Color fpsColor(0, 255, 80, 255);
|
const RenderData::Color fpsColor(0, 255, 80, 255);
|
||||||
const RenderData::Color fpsBg(0, 0, 0, 200);
|
const RenderData::Color fpsBg(0, 0, 0, 200);
|
||||||
const float aspectRatio = static_cast<float>(width) / static_cast<float>(height);
|
|
||||||
|
|
||||||
int32_t fps = 0;
|
int32_t fps = 0;
|
||||||
int32_t frame_count = 0;
|
int32_t frame_count = 0;
|
||||||
uint32_t last_fps_time = 0;
|
uint32_t last_fps_time = 0;
|
||||||
char fps_text[32];
|
char fps_text[32];
|
||||||
|
char perf_text[64];
|
||||||
|
uint32_t last_frame_ms = 0;
|
||||||
|
uint32_t last_present_ms = 0;
|
||||||
|
|
||||||
bool isRunning = true;
|
bool isRunning = true;
|
||||||
uint32_t animation_time_ms = 0;
|
|
||||||
while (isRunning)
|
while (isRunning)
|
||||||
{
|
{
|
||||||
timer.begin_frame(time_source.get_time_ms());
|
const uint32_t frame_start_ms = time_source.get_time_ms();
|
||||||
const uint32_t fixed_delta_ms = timer.fixed_delta_ms();
|
timer.begin_frame(frame_start_ms);
|
||||||
animation_time_ms += fixed_delta_ms;
|
|
||||||
|
|
||||||
display->poll_events(isRunning);
|
display->poll_events(isRunning);
|
||||||
|
|
||||||
ctx.clear(clearColor);
|
ctx.clear_color(clearColor);
|
||||||
|
|
||||||
const float animation_time = static_cast<float>(animation_time_ms) / 1000.0f;
|
|
||||||
const Math::Matrix4x4 model =
|
|
||||||
Math::MathUtil::get_rotation_matrix_y(animation_time) *
|
|
||||||
Math::MathUtil::get_rotation_matrix_x(static_cast<float>(animation_time_ms * 6u) / 10000.0f);
|
|
||||||
const Math::Matrix4x4 view = camera.get_view_matrix();
|
|
||||||
const Math::Matrix4x4 modelView = view * model;
|
|
||||||
const Math::Matrix4x4 projection = camera.get_perspective_projection_matrix(aspectRatio);
|
|
||||||
const Math::Matrix4x4 viewport = camera.get_viewport_matrix(static_cast<float>(width), static_cast<float>(height));
|
|
||||||
const Math::Matrix4x4 mvp = projection * modelView;
|
|
||||||
|
|
||||||
std::array<Math::Vector3, 8> viewSpaceVertices;
|
|
||||||
std::array<ProjectedVertex, 8> projectedVertices;
|
|
||||||
for (size_t i = 0; i < cubeVertices.size(); ++i)
|
|
||||||
{
|
|
||||||
viewSpaceVertices[i] = (modelView * Math::Vector4::Point(cubeVertices[i])).to_vector3();
|
|
||||||
projectedVertices[i] = ProjectToScreen(cubeVertices[i], mvp, viewport);
|
|
||||||
}
|
|
||||||
|
|
||||||
std::array<bool, 6> visibleFaces = {};
|
|
||||||
for (size_t faceIndex = 0; faceIndex < cubeFaces.size(); ++faceIndex)
|
|
||||||
{
|
|
||||||
visibleFaces[faceIndex] = IsFaceVisible(cubeFaces[faceIndex], viewSpaceVertices);
|
|
||||||
}
|
|
||||||
|
|
||||||
std::array<RenderData::Triangle, 12> drawTriangles;
|
|
||||||
size_t drawCommandCount = 0;
|
|
||||||
for (const CubeTriangle &cubeTriangle : cubeTriangles)
|
|
||||||
{
|
|
||||||
if (!IsTriangleVisible(cubeTriangle, viewSpaceVertices))
|
|
||||||
{
|
|
||||||
continue;
|
|
||||||
}
|
|
||||||
|
|
||||||
const ProjectedVertex &v0 = projectedVertices[cubeTriangle.vertices[0]];
|
|
||||||
const ProjectedVertex &v1 = projectedVertices[cubeTriangle.vertices[1]];
|
|
||||||
const ProjectedVertex &v2 = projectedVertices[cubeTriangle.vertices[2]];
|
|
||||||
if (!v0.visible || !v1.visible || !v2.visible)
|
|
||||||
{
|
|
||||||
continue;
|
|
||||||
}
|
|
||||||
|
|
||||||
drawTriangles[drawCommandCount++] =
|
|
||||||
RenderData::Triangle(
|
|
||||||
Scene::Vertex(v0.screen),
|
|
||||||
Scene::Vertex(v1.screen),
|
|
||||||
Scene::Vertex(v2.screen));
|
|
||||||
}
|
|
||||||
|
|
||||||
for (size_t i = 0; i < drawCommandCount; ++i)
|
|
||||||
{
|
|
||||||
ctx.draw_triangle(drawTriangles[i], cubeColor);
|
|
||||||
}
|
|
||||||
|
|
||||||
for (size_t faceIndex = 0; faceIndex < cubeFaces.size(); ++faceIndex)
|
|
||||||
{
|
|
||||||
if (!visibleFaces[faceIndex])
|
|
||||||
{
|
|
||||||
continue;
|
|
||||||
}
|
|
||||||
|
|
||||||
const CubeFace &face = cubeFaces[faceIndex];
|
|
||||||
for (size_t edgeOffset = 0; edgeOffset < face.vertices.size(); ++edgeOffset)
|
|
||||||
{
|
|
||||||
const int startIndex = face.vertices[edgeOffset];
|
|
||||||
const int endIndex = face.vertices[(edgeOffset + 1) % face.vertices.size()];
|
|
||||||
const ProjectedVertex &start = projectedVertices[startIndex];
|
|
||||||
const ProjectedVertex &end = projectedVertices[endIndex];
|
|
||||||
if (!start.visible || !end.visible)
|
|
||||||
{
|
|
||||||
continue;
|
|
||||||
}
|
|
||||||
|
|
||||||
ctx.draw_line(
|
|
||||||
Math::Vector2(start.screen.x, start.screen.y).to_vector2Int(),
|
|
||||||
Math::Vector2(end.screen.x, end.screen.y).to_vector2Int(),
|
|
||||||
clearColor);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// sprite test
|
// sprite test
|
||||||
ctx.draw_sprite(10, 10, sprite_img);
|
ctx.draw_sprite(10, 10, sprite_img);
|
||||||
ctx.draw_sprite_region_ex(30, 10, sprite_region, 2, false, false);
|
ctx.draw_sprite_region_ex(30, 10, sprite_region, 2, false, false);
|
||||||
ctx.draw_sprite_region_ex(10, 30, sprite_region, 3, true, false);
|
ctx.draw_sprite_region_ex(10, 30, sprite_region, 3, true, false);
|
||||||
ctx.draw_tilemap(testTilemap, 650, 500, 96, 48, static_cast<int32_t>(animation_time_ms / 20u) % 32, 0);
|
ctx.draw_tilemap(testTilemap, 650, 500, 96, 48, static_cast<int32_t>(frame_start_ms / 20u) % 32, 0);
|
||||||
|
|
||||||
// FPS counter
|
// FPS counter
|
||||||
++frame_count;
|
++frame_count;
|
||||||
|
|
@ -375,8 +181,13 @@ int main(int argc, char *argv[])
|
||||||
}
|
}
|
||||||
std::snprintf(fps_text, sizeof(fps_text), "FPS: %d", fps);
|
std::snprintf(fps_text, sizeof(fps_text), "FPS: %d", fps);
|
||||||
ctx.draw_text(font, 4, 4, fpsColor, fpsBg, fps_text);
|
ctx.draw_text(font, 4, 4, fpsColor, fpsBg, fps_text);
|
||||||
|
std::snprintf(perf_text, sizeof(perf_text), "Frame:%ums Present:%ums", last_frame_ms, last_present_ms);
|
||||||
|
ctx.draw_text(font, 4, 4 + font.char_h, fpsColor, fpsBg, perf_text);
|
||||||
|
|
||||||
|
const uint32_t present_start_ms = time_source.get_time_ms();
|
||||||
ctx.present(display);
|
ctx.present(display);
|
||||||
|
last_present_ms = time_source.get_time_ms() - present_start_ms;
|
||||||
|
last_frame_ms = time_source.get_time_ms() - frame_start_ms;
|
||||||
SleepRemainingFrameTime(timer, time_source);
|
SleepRemainingFrameTime(timer, time_source);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -3,6 +3,7 @@
|
||||||
#include <vector>
|
#include <vector>
|
||||||
#include "Vector2.h"
|
#include "Vector2.h"
|
||||||
#include <cmath>
|
#include <cmath>
|
||||||
|
#include <cstddef>
|
||||||
|
|
||||||
namespace Core
|
namespace Core
|
||||||
{
|
{
|
||||||
|
|
|
||||||
|
|
@ -18,6 +18,6 @@ namespace Core
|
||||||
}
|
}
|
||||||
// Row-major layout with y = 0 on the first row, matching a top-left screen origin.
|
// Row-major layout with y = 0 on the first row, matching a top-left screen origin.
|
||||||
size_t index = static_cast<size_t>(y) * width + x;
|
size_t index = static_cast<size_t>(y) * width + x;
|
||||||
buffer.at(index) = color;
|
buffer[index] = color;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
|
||||||
|
|
@ -1,6 +1,7 @@
|
||||||
#pragma once
|
#pragma once
|
||||||
#include "Color.h"
|
#include "Color.h"
|
||||||
#include "Vector2.h"
|
#include "Vector2.h"
|
||||||
|
#include <cstddef>
|
||||||
#include <cstdint>
|
#include <cstdint>
|
||||||
#include <vector>
|
#include <vector>
|
||||||
|
|
||||||
|
|
@ -43,4 +44,4 @@ namespace Core
|
||||||
|
|
||||||
void set_pixel(const int32_t x, const int32_t y, const uint32_t color);
|
void set_pixel(const int32_t x, const int32_t y, const uint32_t color);
|
||||||
};
|
};
|
||||||
}
|
}
|
||||||
|
|
|
||||||
|
|
@ -40,6 +40,11 @@ namespace Gfx
|
||||||
depthBuffer->clear();
|
depthBuffer->clear();
|
||||||
}
|
}
|
||||||
|
|
||||||
|
void DrawContext::clear_color(const RenderData::Color& color)
|
||||||
|
{
|
||||||
|
frameBuffer->clear(color);
|
||||||
|
}
|
||||||
|
|
||||||
void DrawContext::clear_depth()
|
void DrawContext::clear_depth()
|
||||||
{
|
{
|
||||||
depthBuffer->clear();
|
depthBuffer->clear();
|
||||||
|
|
|
||||||
|
|
@ -46,6 +46,7 @@ namespace Gfx
|
||||||
int32_t get_height() const;
|
int32_t get_height() const;
|
||||||
|
|
||||||
void clear(const RenderData::Color& color);
|
void clear(const RenderData::Color& color);
|
||||||
|
void clear_color(const RenderData::Color& color);
|
||||||
void clear_depth();
|
void clear_depth();
|
||||||
|
|
||||||
void draw_line(const Math::Vector2Int& from, const Math::Vector2Int& to, const RenderData::Color& color);
|
void draw_line(const Math::Vector2Int& from, const Math::Vector2Int& to, const RenderData::Color& color);
|
||||||
|
|
|
||||||
|
|
@ -1,4 +1,5 @@
|
||||||
#pragma once
|
#pragma once
|
||||||
|
#include <cstddef>
|
||||||
#include <cmath>
|
#include <cmath>
|
||||||
#include "Vector3.h"
|
#include "Vector3.h"
|
||||||
#include "Vector4.h"
|
#include "Vector4.h"
|
||||||
|
|
|
||||||
|
|
@ -13,6 +13,22 @@
|
||||||
|
|
||||||
namespace Platform
|
namespace Platform
|
||||||
{
|
{
|
||||||
|
namespace
|
||||||
|
{
|
||||||
|
inline uint16_t rgba_to_rgb565(uint32_t rgba)
|
||||||
|
{
|
||||||
|
const uint32_t r = (rgba >> 24) & 0xFFu;
|
||||||
|
const uint32_t g = (rgba >> 16) & 0xFFu;
|
||||||
|
const uint32_t b = (rgba >> 8) & 0xFFu;
|
||||||
|
return static_cast<uint16_t>(((r & 0xF8u) << 8) | ((g & 0xFCu) << 3) | (b >> 3));
|
||||||
|
}
|
||||||
|
|
||||||
|
inline uint32_t rgba_to_argb8888(uint32_t rgba)
|
||||||
|
{
|
||||||
|
return ((rgba >> 8) & 0x00FFFFFFu) | ((rgba & 0xFFu) << 24);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
bool FBDisplay::init(int w, int h)
|
bool FBDisplay::init(int w, int h)
|
||||||
{
|
{
|
||||||
width = w;
|
width = w;
|
||||||
|
|
@ -81,15 +97,65 @@ namespace Platform
|
||||||
return;
|
return;
|
||||||
|
|
||||||
const uint32_t* src = static_cast<const uint32_t*>(framebuffer->get_buffer());
|
const uint32_t* src = static_cast<const uint32_t*>(framebuffer->get_buffer());
|
||||||
int dst_width = std::min(width, static_cast<int>(vinfo.xres));
|
const int src_width = framebuffer->get_width();
|
||||||
int dst_height = std::min(height, static_cast<int>(vinfo.yres));
|
const int dst_width = std::min(src_width, static_cast<int>(vinfo.xres));
|
||||||
int bytes_per_pixel = vinfo.bits_per_pixel / 8;
|
const int dst_height = std::min(framebuffer->get_height(), static_cast<int>(vinfo.yres));
|
||||||
|
const int bytes_per_pixel = vinfo.bits_per_pixel / 8;
|
||||||
|
const bool is_rgb565 =
|
||||||
|
vinfo.bits_per_pixel == 16 &&
|
||||||
|
vinfo.red.offset == 11 && vinfo.red.length == 5 &&
|
||||||
|
vinfo.green.offset == 5 && vinfo.green.length == 6 &&
|
||||||
|
vinfo.blue.offset == 0 && vinfo.blue.length == 5;
|
||||||
|
const bool is_argb8888 =
|
||||||
|
vinfo.bits_per_pixel == 32 &&
|
||||||
|
vinfo.red.offset == 16 && vinfo.red.length == 8 &&
|
||||||
|
vinfo.green.offset == 8 && vinfo.green.length == 8 &&
|
||||||
|
vinfo.blue.offset == 0 && vinfo.blue.length == 8;
|
||||||
|
const bool is_rgba8888 =
|
||||||
|
vinfo.bits_per_pixel == 32 &&
|
||||||
|
vinfo.red.offset == 24 && vinfo.red.length == 8 &&
|
||||||
|
vinfo.green.offset == 16 && vinfo.green.length == 8 &&
|
||||||
|
vinfo.blue.offset == 8 && vinfo.blue.length == 8;
|
||||||
|
|
||||||
|
if (is_rgb565)
|
||||||
|
{
|
||||||
|
for (int y = 0; y < dst_height; ++y)
|
||||||
|
{
|
||||||
|
const uint32_t* src_row = src + y * src_width;
|
||||||
|
uint16_t* dst_row = reinterpret_cast<uint16_t*>(fb_mem + y * finfo.line_length);
|
||||||
|
for (int x = 0; x < dst_width; ++x)
|
||||||
|
{
|
||||||
|
dst_row[x] = rgba_to_rgb565(src_row[x]);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
if (is_argb8888)
|
||||||
|
{
|
||||||
|
for (int y = 0; y < dst_height; ++y)
|
||||||
|
{
|
||||||
|
const uint32_t* src_row = src + y * src_width;
|
||||||
|
uint32_t* dst_row = reinterpret_cast<uint32_t*>(fb_mem + y * finfo.line_length);
|
||||||
|
for (int x = 0; x < dst_width; ++x)
|
||||||
|
{
|
||||||
|
dst_row[x] = rgba_to_argb8888(src_row[x]);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
if (is_rgba8888 && dst_width == src_width && static_cast<int>(finfo.line_length) == dst_width * 4)
|
||||||
|
{
|
||||||
|
std::memcpy(fb_mem, src, static_cast<size_t>(dst_height) * static_cast<size_t>(dst_width) * sizeof(uint32_t));
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
for (int y = 0; y < dst_height; ++y)
|
for (int y = 0; y < dst_height; ++y)
|
||||||
{
|
{
|
||||||
for (int x = 0; x < dst_width; ++x)
|
for (int x = 0; x < dst_width; ++x)
|
||||||
{
|
{
|
||||||
uint32_t pixel = convert_pixel(src[y * width + x]);
|
uint32_t pixel = convert_pixel(src[y * src_width + x]);
|
||||||
uint8_t* dst = fb_mem + y * finfo.line_length + x * bytes_per_pixel;
|
uint8_t* dst = fb_mem + y * finfo.line_length + x * bytes_per_pixel;
|
||||||
if (vinfo.bits_per_pixel == 32)
|
if (vinfo.bits_per_pixel == 32)
|
||||||
{
|
{
|
||||||
|
|
|
||||||