Setup with Ansible
Quick Start
Prerequisites
- Install deps
bash
# On the control node
pipx install --include-deps ansible
pip install passlib
pipx install rust-just # or: sudo apt install just
sudo apt install pwgen # usage: pwgen -s 64 1
- Configure firewall
- Configure DNS settings
ess-helm
- Install PostgreSQL on a dedicated server: install pg
- Set up the database for Synapse; set up the database for MAS
- Enable password authentication
conf
# ~/17/data/pg_hba.conf
host all all 10.33.12.111/32 scram-sha-256
bash
# as the postgres user (pg_ctl needs PGDATA set or -D <data directory>)
/usr/pgsql-17/bin/pg_ctl reload
- ess-helm get started
bash
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--flannel-iface=eth1" sh -
cp /etc/rancher/k3s/k3s.yaml ~/.kube/config
Config elementWeb/synapse ...
Install
bash
helm upgrade --install --namespace "ess" ess \
  oci://ghcr.io/element-hq/ess-helm/matrix-stack \
  -f ~/ess-config-values/hostnames.yaml \
  -f ~/ess-config-values/tls.yaml \
  -f ~/ess-config-values/postgresql.yaml \
  --wait
bash
# rocky linux
dnf install git tar # required for install helm
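WARNING
If kubectl create namespace ess fails with "The connection to the server localhost:8080 was refused", kubectl cannot find a kubeconfig. Copy the k3s one into place and retry:
bash
mkdir -p ~/.kube
cp /etc/rancher/k3s/k3s.yaml ~/.kube/config
kubectl create namespace ess
If helm upgrade fails, check the events:
kubectl get events -n ess --sort-by=.metadata.creationTimestamp
A likely cause is that the self-hosted PostgreSQL server is unreachable.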
pg
bash
# /home/postgres/17/data/postgresql.conf
# changing listen_addresses requires a PostgreSQL restart, not just a reload
listen_addresses = 'xx.xx.xx.xx'
- Tune PostgreSQL parameters online: https://pgtune.leopard.in.ua/
Server tuning
References:
- https://www.mongodb.com/zh-cn/docs/manual/reference/ulimit/
- https://github.com/jitsi/jitsi-videobridge/blob/master/config/20-jvb-udp-buffers.conf
# /etc/sysctl.conf
# file descriptor limit
fs.file-max=100000
# this sets the max, so that we can bump the JVB UDP single port buffer size.
net.core.rmem_max=10485760
net.core.netdev_max_backlog=100000
bash
prlimit --pid 1234 --nofile=100000:200000 # change max open files (soft:hard) for a running process
prlimit --pid 1234 --nofile=100000: # hard limit omitted: only the soft limit changes
bash
sudo sysctl -p
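After sudo sysctl -p, the values can be read back to confirm that they were applied. The following is a minimal sketch and not part of the original notes: the helper names sysctl_path and show_sysctl are made up for illustration, and it reads the files under /proc/sys directly (the files behind the sysctl keys), so it does not depend on the sysctl binary. It assumes key names whose dots are only separators.

```bash
# Map a dotted sysctl key to its file under /proc/sys (hypothetical helper).
sysctl_path() {
  printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

# Print "key = value" for each key, or a note when the key is not exposed
# on this host (for example inside a restricted container).
# The body runs in a subshell so the loop variables stay local.
show_sysctl() (
  for key in "$@"; do
    path=$(sysctl_path "$key")
    if [ -r "$path" ]; then
      printf '%s = %s\n' "$key" "$(cat "$path")"
    else
      printf '%s: not available on this host\n' "$key"
    fi
  done
)

show_sysctl fs.file-max net.core.rmem_max net.core.netdev_max_backlog
```

If a printed value differs from /etc/sysctl.conf, the file was probably not reloaded or another file under /etc/sysctl.d overrides it.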
# /etc/security/limits.conf
* soft nofile 100000
# optional:
# * hard nofile 200000
bash
less /proc/717594/limits # show the limits of a specific process
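After editing /etc/security/limits.conf you usually do not need to reboot the system, but the user has to log in again before the change takes effect.
How to apply the change immediately
1. Log in again (the simplest way)
bash
exit
ssh user@server
Or simply log out and log back in.
2. Switch user with su
bash
su - username
This starts a new login session that picks up the new limits.conf settings.
3. Check that the change is in effect
bash
ulimit -a
Check whether open files (nofile), max user processes (nproc) and the other values have been updated.
Special cases
- System services (such as Nginx or MySQL) may need a restart. Services started by systemd take their limits from the unit (LimitNOFILE) rather than from limits.conf, see pg ulimit:
bash
systemctl restart nginx
- For SSH connections, you may need to restart the SSH service:
bash
systemctl restart sshd
To change the ulimit of an already running process, use prlimit:
bash
prlimit --pid 1234 --nofile=100000:200000
This changes the file descriptor limit of the process on the fly, without a reboot.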
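fs.file-max vs nofile
They are not the same thing: fs.file-max and the soft max open files limit (ulimit -Sn) restrict file descriptors at different levels.
1. fs.file-max (system-wide limit)
- Purpose: caps the total number of file handles the Linux kernel will allocate, shared by all processes.
- View:
bash
sysctl fs.file-max
- Change:
bash
sysctl -w fs.file-max=1000000
echo "fs.file-max = 1000000" >> /etc/sysctl.conf
sysctl -p
- Scope: the whole system; all processes share this limit.
2. ulimit -Sn (per-process soft limit)
- Purpose: caps the number of file descriptors a single process can open (in practice bounded by fs.file-max).
- View:
bash
ulimit -Sn
- Change:
bash
ulimit -Sn 100000
Or edit limits.conf:
bash
sudo nano /etc/security/limits.conf
Add:
plaintext
* soft nofile 100000
* hard nofile 200000
- Scope: the current shell and the processes started from it; the rest of the system is unaffected.
3. Summary
✅ fs.file-max is the system-wide limit: it decides how many file handles the Linux kernel can allocate in total.
✅ ulimit -Sn is the per-process limit: it decides how many file descriptors a single process can open.
✅ Keep ulimit -Sn at or below fs.file-max; a higher per-process limit cannot be used in full because the system-wide total runs out first.
If the server has to handle many concurrent WebSocket or database connections, raise both fs.file-max and ulimit -Sn so that processes are not throttled by either limit.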
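To make the relationship between the two levels concrete, here is a small sketch that compares a soft nofile limit with the system-wide total. The function name classify_nofile and its messages are hypothetical and only serve as illustration; the last lines apply it to the current shell when /proc is readable.

```bash
# Compare a per-process soft limit with the system-wide total.
# $1: soft nofile limit (a number or "unlimited"), $2: fs.file-max value
classify_nofile() {
  if [ "$1" = "unlimited" ] || [ "$1" -gt "$2" ]; then
    echo "soft limit exceeds fs.file-max"
  else
    echo "soft limit within fs.file-max"
  fi
}

# Apply the check to the current shell.
if [ -r /proc/sys/fs/file-max ]; then
  classify_nofile "$(ulimit -Sn)" "$(cat /proc/sys/fs/file-max)"
fi
```

With the values used in this section (soft limit 100000, fs.file-max 1000000) the check reports that the soft limit is within fs.file-max.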
pg ulimit
Services started by systemd do not inherit the ulimit settings of a shell session; they use systemd's own resource limits. LimitNOFILE therefore has to be set explicitly in the PostgreSQL systemd unit.
Solution
1. Check the file descriptor limit of the PostgreSQL process
bash
cat /proc/$(pgrep -u postgres -o postgres)/limits | grep "Max open files"
This shows the actual max open files limit of the running PostgreSQL process.
2. Edit the systemd configuration
Edit the PostgreSQL systemd service:
bash
sudo systemctl edit postgresql-17 # Rocky
sudo systemctl edit postgresql@17-main # Ubuntu
Add:
plaintext
[Service]
LimitNOFILE=100000
3. Reload systemd and restart PostgreSQL
bash
sudo systemctl daemon-reload
sudo systemctl restart postgresql-17
4. Verify that the change is in effect
bash
cat /proc/$(pgrep -u postgres -o postgres)/limits | grep "Max open files"
Why does ulimit not take effect?
- ulimit -n only affects the current shell session; services started by systemd do not inherit the shell's ulimit.
- LimitNOFILE is a systemd-level resource limit and has to be set explicitly in the service unit.
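The verification step can be wrapped in a small helper that prints only the soft and hard values. This is a sketch under the assumption that the limits file has the usual /proc/<pid>/limits layout; the function name max_open_files is made up for illustration.

```bash
# Print the soft and hard "Max open files" values from a limits file.
# $1: path to a file in /proc/<pid>/limits format
max_open_files() {
  awk '/^Max open files/ { print $4, $5 }' "$1"
}

# Example against the current shell; for PostgreSQL, pass
# /proc/<postgres pid>/limits instead.
if [ -r "/proc/$$/limits" ]; then
  max_open_files "/proc/$$/limits"
fi
```

After the restart, the first number printed for the PostgreSQL process should match the LimitNOFILE value from the override.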
Adding k3s nodes
https://docs.k3s.io/zh/quick-start
bash
# server node
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--flannel-iface=eth1" sh -
bash
# on the agent (worker) node
hostnamectl set-hostname k3s-worker-1
nano /etc/hosts
reboot

# K3S_TOKEN is the content of /var/lib/rancher/k3s/server/node-token on the server node
curl -sfL https://get.k3s.io | K3S_URL=https://x.x.x.x:6443 K3S_TOKEN=xxxx INSTALL_K3S_EXEC="--flannel-iface=eth1" sh -

# on the server (master) node
k3s kubectl get nodes -o wide
kubectl label nodes k3s-worker-1 node.type=worker

# on the pg servers: edit pg_hba.conf for the new node, then reload
vi /var/lib/pgsql/17/data/pg_hba.conf
systemctl reload postgresql-17
Private network interface
Reference: BandwagonHost (搬瓦工) private IP
bash
ip a # the eth1 entry with a 10.x.x.x address is the private interface
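Picking the interface out of the ip output can also be scripted. The sketch below assumes, as in the note above, that the private interface is the one carrying a 10.x.x.x IPv4 address; the function name private_iface is hypothetical. It parses the one-line-per-address format printed by ip -o -4 addr show.

```bash
# Read `ip -o -4 addr show` output on stdin and print the first interface
# whose IPv4 address starts with "10." (hypothetical helper).
private_iface() {
  awk '$3 == "inet" && $4 ~ /^10\./ { print $2; exit }'
}

# Use the real interface list when the ip command is available.
if command -v ip >/dev/null 2>&1; then
  ip -o -4 addr show | private_iface
fi
```

The printed name is the value to pass as --flannel-iface.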
If flannel-iface was not set through the install arguments when k3s was installed, there are two ways to change it afterwards:
systemd
server: /etc/systemd/system/k3s.service
ExecStart=/usr/local/bin/k3s \
    server \
    --flannel-iface=eth1 \
agent: /etc/systemd/system/k3s-agent.service
ExecStart=/usr/local/bin/k3s \
    agent \
    --flannel-iface=eth1 \
bash
sudo systemctl daemon-reload
sudo systemctl restart k3s # on agent nodes: k3s-agent
systemctl cat k3s # the output should now contain:
# ExecStart=/usr/local/bin/k3s \
#     server \
#         '--flannel-iface=eth1' \
traefik config
yml
# /var/lib/rancher/k3s/server/manifests/traefik-custom.yaml
# https://docs.k3s.io/helm#customizing-packaged-components-with-helmchartconfig
# k3s applies the change automatically once the file is saved
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    deployment:
      enabled: true
      kind: DaemonSet
User status
Status | MAS | Synapse |
---|---|---|
locked | User account is temporarily disabled, cannot log in but account data is preserved. Can be unlocked by admin. | User is temporarily suspended, cannot perform actions but account exists. Can be unlocked via admin API. |
deactivated | User account is permanently disabled, all sessions invalidated. Cannot be reactivated through MAS interface. | User account is permanently disabled, removed from rooms, profile cleared. Can only be reactivated via admin API with data loss. |
How to reactivate a user that was deactivated through the Synapse admin API while MAS is enabled
js
fetch("https://xxxx/_synapse/admin/v2/users/@xxx:yyy.zzz", {
  method: "PUT",
  headers: {
    "Authorization": "Bearer YOUR_ADMIN_TOKEN",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    deactivated: false
    // ⚠️ do not set the password field; this applies to OIDC/MAS mode
  })
})
  .then(response => response.json())
  .then(data => console.log("Response:", data))
  .catch(error => console.error("Error:", error));
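The same request can be sent from a shell on the server with curl. This is a sketch only: the function names admin_user_url and reactivate_user are hypothetical, and the host, user ID and token are the same placeholders as in the fetch() example. The block only defines the functions; nothing is sent until reactivate_user is called.

```bash
# Build the Synapse admin URL for a user (hypothetical helper).
# $1: homeserver base URL, $2: full Matrix user ID such as @xxx:yyy.zzz
admin_user_url() {
  printf '%s/_synapse/admin/v2/users/%s\n' "${1%/}" "$2"
}

# Send the same PUT request as the fetch() call, without a password field.
# $1: base URL, $2: user ID, $3: admin access token
reactivate_user() {
  curl -sS -X PUT \
    -H "Authorization: Bearer $3" \
    -H "Content-Type: application/json" \
    -d '{"deactivated": false}' \
    "$(admin_user_url "$1" "$2")"
}

# Usage (placeholders, replace with real values):
# reactivate_user "https://xxxx" "@xxx:yyy.zzz" "YOUR_ADMIN_TOKEN"
```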
Migrating the database
yml
# source pg: /etc/postgresql/17/main/pg_hba.conf
host replication all target-pg-ip/32 scram-sha-256
bash
# k3s server node
systemctl stop k3s
# k3s agent node
systemctl stop k3s-agent

# target pg server
systemctl stop postgresql@17-main
cd /var/lib/postgresql/17/
mv main/ main-backup
# the postgres user has the replication permission; enter its password when prompted
pg_basebackup -h source-ip -p pg-port -U postgres -D ./main -Fp -Xs -P
chown -R postgres:postgres main
# copy pg_hba.conf and postgresql.conf from source to target, and modify if needed
systemctl start postgresql@17-main

# source pg server
systemctl stop postgresql@17-main

# k3s server node
systemctl start k3s
# k3s agent node
systemctl start k3s-agent

kubectl scale statefulset -l app.kubernetes.io/component=matrix-server --replicas=0 -n ess

# k3s server node
vi hostnames.yaml
vi postgresql.yaml
helm upgrade ...
yml
media:
  # size
  maxUploadSize: 2000M

ingress:
  # Traefik
  className: traefik
  annotations:
    kubernetes.io/ingress.class: traefik
    # Limit
    traefik.ingress.kubernetes.io/max-body-size: "2G"
users
sql
-- users with a session active in the last 5 days
SELECT COUNT(DISTINCT user_id)
FROM user_sessions
WHERE last_active_at >= NOW() - INTERVAL '5 days';

-- devices seen in the last 24 hours (last_seen is in milliseconds)
SELECT COUNT(*)
FROM devices
WHERE to_timestamp(last_seen / 1000.0) >= NOW() - INTERVAL '24 hours';

-- distinct users with a device seen in the last 24 hours
SELECT COUNT(DISTINCT user_id)
FROM devices
WHERE to_timestamp(last_seen / 1000.0) >= NOW() - INTERVAL '24 hours';

-- messages per sender over the last 3 days, most active first
SELECT COUNT(*), sender
FROM events
WHERE (type = 'm.room.encrypted' OR type = 'm.room.message')
  AND origin_server_ts >= DATE_PART('epoch', NOW() - INTERVAL '3 day') * 1000
GROUP BY sender
ORDER BY COUNT(*) DESC
LIMIT 6000;

-- number of users who sent more than 10 messages in the last day
SELECT COUNT(*) AS user_count
FROM (
  SELECT sender
  FROM events
  WHERE (type = 'm.room.encrypted' OR type = 'm.room.message')
    AND origin_server_ts >= DATE_PART('epoch', NOW() - INTERVAL '1 day') * 1000
  GROUP BY sender
  HAVING COUNT(*) > 10
) AS active_users;