golang: one reason recover() does not catch a panic

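Preface: I ran into an intermittent panic whose cause I had not yet found, so I temporarily suppressed it with recover() and printed information to help locate it.
Some of the code was never reached, so I looked up the rules of recover: "The program first hits the panic; control then jumps to the deferred function that contains recover(), recover captures the panic, and the panic stops propagating. After recover, however, the program does not go back to the point of the panic and continue from there; it continues from the point of recovery." (In other words, the function that deferred the recovering call returns to its caller.)

So if the panic is raised in a function started with go f() and the recover sits outside that goroutine, the two are not in the same goroutine, and the recover is never reached.

Wrong: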

func test() {
    defer func() {
        if err := recover(); err != nil {
            fmt.Println("panic")
        }
    }()
    // f panics in its own goroutine, so the deferred recover above never sees it.
    go f()
}

Correct:

func test() {
    go func() {
        // recover must be deferred inside the goroutine that may panic.
        defer func() {
            if err := recover(); err != nil {
                fmt.Println("panic")
            }
        }()
        f()
    }()
}
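
The pattern above can be tried end to end with the minimal sketch below. The function f here is a hypothetical stand-in that always panics, and runRecovered is a helper name made up for this illustration; the sync.WaitGroup is only there so that main waits for the goroutine to finish.

```go
package main

import (
	"fmt"
	"sync"
)

// f stands in for the function that panics (hypothetical).
func f() {
	panic("boom")
}

// runRecovered runs fn in a new goroutine, recovers any panic inside that
// same goroutine, and returns the recovered value (nil if fn did not panic).
func runRecovered(fn func()) (recovered interface{}) {
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done() // runs last, after the recover below
		defer func() {
			recovered = recover()
		}()
		fn()
	}()
	wg.Wait()
	return recovered
}

func main() {
	fmt.Println("recovered:", runRecovered(f)) // prints "recovered: boom"
	fmt.Println("main continues")
}
```

Because the deferred recover lives in the goroutine that panics, the process survives and main keeps running.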

golang pprof: fixing "404 page not found" when accessing debug/pprof

To understand this problem, start with how net/http/pprof works. Its source contains:

func init() {
	http.HandleFunc("/debug/pprof/", Index)
	http.HandleFunc("/debug/pprof/cmdline", Cmdline)
	http.HandleFunc("/debug/pprof/profile", Profile)
	http.HandleFunc("/debug/pprof/symbol", Symbol)
	http.HandleFunc("/debug/pprof/trace", Trace)
}

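Importing _ "net/http/pprof" runs this init function, which registers the pprof routes on the default mux (http.DefaultServeMux). If the server is started with a different router as its handler, those http.HandleFunc registrations are never consulted, and that is where the 404 comes from. I was using the httprouter package.

Knowing this, I added the five routes above to the router by hand and tried again. Requests to http://localhost:8080/debug/pprof/goroutine?debug=1 and the other sub-pages still returned 404. Looking further: when httprouter registers a route such as the first one, it matches only the path /debug/pprof/ itself and does not route sub-paths to it, hence the 404.

So the solution:

Register the httprouter router on the default http mux with http.Handle("/", router), and call http.ListenAndServe(addr, nil) instead of http.ListenAndServe(addr, router). Both the router and net/http/pprof then work.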

http.Handle("/", router)
http.ListenAndServe(addr, nil)

Of course, you can also use the prefix matching of a third-party router, for example Gorilla's:

router.NewRoute().PathPrefix("/debug/pprof/").HandlerFunc(pprof.Index)

I have not found an equivalent feature in httprouter; if you know of one, please leave me a comment.
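
The idea of letting the standard mux own the /debug/pprof/ prefix while everything else goes to the application router can be sketched with the standard library alone. In this sketch newMux and statusFor are illustrative names, and http.NotFoundHandler() is only a placeholder for the real router (for example an httprouter instance); the requests are served in memory through httptest, so no port is opened.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/pprof"
)

// newMux returns a mux that serves the pprof pages itself and forwards
// every other request to app.
func newMux(app http.Handler) *http.ServeMux {
	mux := http.NewServeMux()
	// The trailing slash makes this a prefix pattern, so sub-paths such as
	// /debug/pprof/goroutine also reach pprof.Index.
	mux.HandleFunc("/debug/pprof/", pprof.Index)
	mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
	mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
	mux.Handle("/", app)
	return mux
}

// statusFor sends an in-memory GET request for path and returns the status code.
func statusFor(path string) int {
	app := http.NotFoundHandler() // placeholder for the real application router
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, path, nil)
	newMux(app).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor("/debug/pprof/goroutine?debug=1")) // 200
	fmt.Println(statusFor("/no/such/page"))                  // 404 from the placeholder app
}
```

In a real server the value returned by newMux(router) would be passed to http.ListenAndServe(addr, ...) in place of the bare router.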

 

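AXFR and IXFR in the DNS protocol explained

I. Introduction

axfr: DNS Zone Transfer Protocol (AXFR), the full zone transfer protocol of DNS. In a master/slave setup, the slave requests the complete data of a zone from the master, the master returns AXFR messages, and the slave's copy of that zone is fully refreshed.

ixfr: Incremental Zone Transfer in DNS, the incremental counterpart of AXFR. AXFR returns the entire zone to the slave in one go, whereas IXFR returns only the changes.

II. Background

Any discussion of AXFR and IXFR has to mention another DNS concept: the serial number in the SOA record. The SOA record itself is not covered here. The serial number identifies the version of the zone; normally it is incremented by 1 each time the zone changes, and it is what lets a slave state which version it holds and fetch the changes made since then.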
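
How two serial numbers are compared is not covered in this post. As a small sketch, assuming the 32-bit serial number arithmetic defined in RFC 1982 (where a serial may wrap around past 4294967295), the check "is the master's serial newer than mine" could look like this. The function name serialNewer is made up for the illustration.

```go
package main

import "fmt"

// serialNewer reports whether serial a is newer than serial b under
// RFC 1982 serial number arithmetic. The unsigned difference is
// reinterpreted as a signed 32-bit value, so a serial that has wrapped
// around still counts as newer. When the two serials are exactly 2^31
// apart the order is undefined, and this function returns false.
func serialNewer(a, b uint32) bool {
	return int32(a-b) > 0
}

func main() {
	fmt.Println(serialNewer(2024010102, 2024010101)) // true
	fmt.Println(serialNewer(5, 5))                   // false
	fmt.Println(serialNewer(1, 4294967295))          // true: the serial wrapped around
}
```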

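As shown in the capture above (ignoring the TCP signaling packets):

Packets 1 and 2 are the NOTIFY request and response: the zone has changed and the master notifies the slave.

Packets 3 and 4 are the SOA query and response: the slave asks the master for its latest serial number.

Packets 10 and 12 are the IXFR request and response: the slave asks for the zone changes. AXFR works the same way.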

 

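III. AXFR

In the two figures above, figure 1 is an AXFR query and figure 2 an AXFR response. The query is straightforward. In the response, the AXFR result is carried in the answer section: the SOA records at the beginning and the end act as an opening and a closing bracket, and everything between them is the full set of records in the zone.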

 

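IV. IXFR

Above is the most basic IXFR request and response. Besides using a different query type from AXFR, an IXFR query also carries the slave's current SOA record in the third section, the authority (NS) section. The master uses the slave's serial number (red box in the figure above) to work out the incremental changes and returns them to the slave.

The response in the figure contains four SOA records, again in the answer section. The first and last SOA have the same meaning as in AXFR: they act as brackets. The second SOA carries the old serial number and is followed by the RRs to delete; the third SOA carries the newer serial number and is followed by the RRs to add. As rfc1995 describes, when the slave is several versions behind, the updates can either be chained one after another or be computed directly into a single final delta.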


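That covers the normal IXFR response. The following cases also exist:

  1. The incremental data cannot be obtained because it is incomplete or has been lost: the full zone is returned, and the answer section looks like AXFR (SOA, records ..., SOA)
  2. The serial in the IXFR request equals the latest serial: the answer section contains a single SOA record
  3. The serial in the IXFR request is lower than the latest serial but no RRs changed: just two SOA records are returned (SOA, SOA)
  4. The serial in the IXFR request is higher than the master's latest serial: this is abnormal, and AXFR is returned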
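
The answer-section layouts described above can be told apart mechanically. The sketch below is a simplified illustration only: the RR and Delta types and the classify function are made up for it and are not taken from any DNS library, and a record is reduced to its type, a serial (for SOA) and free-form data. It treats a single SOA as "up-to-date", a non-SOA second record as a full (AXFR-style) answer, and anything else as a chain of delete/add steps.

```go
package main

import (
	"errors"
	"fmt"
)

// RR is a simplified resource record; Serial is only meaningful for SOA.
type RR struct {
	Type   string
	Serial uint32
	Data   string
}

// Delta is one version step: records deleted from From and added to reach To.
type Delta struct {
	From, To       uint32
	Deleted, Added []RR
}

// classify inspects the answer section of an IXFR response and returns
// "up-to-date", "full" or "incremental" plus the parsed deltas.
func classify(answer []RR) (kind string, deltas []Delta, err error) {
	n := len(answer)
	if n == 0 || answer[0].Type != "SOA" {
		return "", nil, errors.New("answer must start with an SOA record")
	}
	if n == 1 {
		return "up-to-date", nil, nil
	}
	if answer[n-1].Type != "SOA" {
		return "", nil, errors.New("answer must end with an SOA record")
	}
	if answer[1].Type != "SOA" {
		return "full", nil, nil // AXFR-style: SOA, records..., SOA
	}
	i := 1 // index of the old SOA that opens the next delta
	for i < n-1 {
		d := Delta{From: answer[i].Serial}
		i++
		for ; i < n-1 && answer[i].Type != "SOA"; i++ {
			d.Deleted = append(d.Deleted, answer[i])
		}
		if i >= n-1 {
			return "", nil, errors.New("delta is missing its new SOA record")
		}
		d.To = answer[i].Serial
		i++
		for ; i < n-1 && answer[i].Type != "SOA"; i++ {
			d.Added = append(d.Added, answer[i])
		}
		deltas = append(deltas, d)
	}
	return "incremental", deltas, nil
}

func main() {
	soa := func(serial uint32) RR { return RR{Type: "SOA", Serial: serial} }
	// Hypothetical four-SOA response: serial 1 -> 3, one RR removed, one added.
	answer := []RR{
		soa(3),
		soa(1), {Type: "A", Data: "old.example. 192.0.2.1"},
		soa(3), {Type: "A", Data: "new.example. 192.0.2.2"},
		soa(3),
	}
	kind, deltas, err := classify(answer)
	fmt.Println(kind, len(deltas), err) // incremental 1 <nil>
}
```

With this layout rule, the two-SOA answer from case 3 parses as "incremental" with zero deltas.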

 

V. References

rfc1995 IXFR https://www.rfc-editor.org/rfc/rfc1995.txt

https://tools.ietf.org/id/draft-ah-dnsext-rfc1995bis-ixfr-02.html

rfc5936 AXFR https://www.rfc-editor.org/rfc/rfc5936.txt

 

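linux: converting ext2/ext3 to ext4 in place without data loss

English reference article: https://www.ghacks.net/2010/08/11/convert-ext23-to-ext4/

Back up first. Back up first. Back up first.

If you are feeling diligent, read the original English article.

1. Prerequisite:
Kernel version 2.6.28-11 or later. To check: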

uname -a

2. Take the disk offline:

umount -l /dev/sda1

For the system disk, you need to boot from a live CD.

3. Convert
ext2 -> ext4

tune2fs -O extents,uninit_bg,dir_index,has_journal /dev/sda1

ext3 -> ext4

tune2fs -O extents,uninit_bg,dir_index /dev/sda1

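If the command fails with a busy message, a process is still using the partition. Run fsck /dev/sda1; if that does not help, comment out the mount entry for this partition in /etc/fstab and reboot.

4. Check the filesystem
e2fsck -pf /dev/sda1
If it reports errors, read the message. It may ask you to run the check manually; in that case run e2fsck -f /dev/sda1 and answer y to every prompt.

5. Edit /etc/fstab and change the filesystem type of /dev/sda1 to ext4

6. Refresh grub. This is quoted from the original article; the partition I converted holds /home, so I did not need this step. Adapt it to your own setup.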
Now you need to refresh grub. Depending upon how your boot partition is will determine how you do this. If your boot partition is SEPARATE, do the following:

sudo bash
mkdir /mnt/boot
mount /dev/sda1 /mnt/boot
grub-install /dev/sda --root-directory=/mnt --recheck

If your boot partition is NOT separate, do the following:

sudo bash
mount /dev/sda1 /mnt
grub-install /dev/sda --root-directory=/mnt --recheck

golang: fixing the "lfstackPack redeclared in this block" error

/usr/local/go/src/runtime/lfstack_amd64.go:16: lfstackPack redeclared in this block
previous declaration at /usr/local/go/src/runtime/lfstack_64bit.go:37
/usr/local/go/src/runtime/lfstack_amd64.go:20: lfstackUnpack redeclared in this block
previous declaration at /usr/local/go/src/runtime/lfstack_64bit.go:41
/usr/local/go/src/runtime/os_linux_generic.go:13: _SS_DISABLE redeclared in this block
previous declaration at /usr/local/go/src/runtime/os2_linux_generic.go:12
/usr/local/go/src/runtime/os_linux_generic.go:14: _NSIG redeclared in this block
previous declaration at /usr/local/go/src/runtime/os2_linux_generic.go:13
/usr/local/go/src/runtime/os_linux_generic.go:15: _SI_USER redeclared in this block
previous declaration at /usr/local/go/src/runtime/os2_linux_generic.go:14
/usr/local/go/src/runtime/os_linux_generic.go:16: _SIG_BLOCK redeclared in this block
previous declaration at /usr/local/go/src/runtime/os2_linux_generic.go:15
/usr/local/go/src/runtime/os_linux_generic.go:17: _SIG_UNBLOCK redeclared in this block
previous declaration at /usr/local/go/src/runtime/os2_linux_generic.go:16
/usr/local/go/src/runtime/os_linux_generic.go:18: _SIG_SETMASK redeclared in this block
previous declaration at /usr/local/go/src/runtime/os2_linux_generic.go:17
/usr/local/go/src/runtime/os_linux_generic.go:19: _RLIMIT_AS redeclared in this block
previous declaration at /usr/local/go/src/runtime/os2_linux_generic.go:18
/usr/local/go/src/runtime/os_linux_generic.go:25: sigset redeclared in this block
previous declaration at /usr/local/go/src/runtime/os2_linux_generic.go:24
/usr/local/go/src/runtime/os_linux_generic.go:25: too many errors

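I have not found the root cause of this error yet. The fix below comes from SegmentFault, and it is consistent with the warning in the Go installation instructions against extracting an archive into an existing /usr/local/go tree. Remove the old installation:
rm -rf /usr/local/go
then extract the archive again:
tar -C /usr/local -xzf go$VERSION.$OS-$ARCH.tar.gz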

nginx: SSL certificate configuration fails with X509_check_private_key:key values mismatch

[root@instance vhost]# /usr/local/nginx/sbin/nginx -s reload
nginx: [emerg] SSL_CTX_use_PrivateKey_file("/XXX.key") failed (SSL: error:X:x509 certificate routines:X509_check_private_key:key values mismatch)

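As the message says, the private key does not match the certificate, so nginx rejects the pair.
First check whether the wrong certificate or key file is configured. A certificate request usually yields two certificates: the one issued to you and the certificate of the issuing authority. Pointing nginx at the wrong one produces this error.
Also rule out invisible characters such as tabs, spaces or line breaks in the private key file (skip this if the key file was never edited).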
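
The same certificate/key consistency check can be reproduced outside nginx. The sketch below relies on tls.X509KeyPair from the Go standard library, which returns an error when the private key does not belong to the certificate. To stay self-contained it generates throwaway ECDSA keys and a self-signed certificate in memory; the helper names pairError, newKey and selfSigned and the subject example.test are made up for this illustration.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"fmt"
	"math/big"
	"time"
)

// pairError returns nil when the PEM certificate and PEM private key
// belong together, and a non-nil error otherwise.
func pairError(certPEM, keyPEM []byte) error {
	_, err := tls.X509KeyPair(certPEM, keyPEM)
	return err
}

// newKey generates a throwaway ECDSA key and its PEM encoding.
func newKey() (*ecdsa.PrivateKey, []byte) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		panic(err)
	}
	der, err := x509.MarshalECPrivateKey(key)
	if err != nil {
		panic(err)
	}
	return key, pem.EncodeToMemory(&pem.Block{Type: "EC PRIVATE KEY", Bytes: der})
}

// selfSigned issues a short-lived self-signed certificate for key.
func selfSigned(key *ecdsa.PrivateKey) []byte {
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "example.test"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		panic(err)
	}
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
}

func main() {
	key1, key1PEM := newKey()
	_, key2PEM := newKey()
	certPEM := selfSigned(key1)
	fmt.Println("matching key: ", pairError(certPEM, key1PEM)) // <nil>
	fmt.Println("unrelated key:", pairError(certPEM, key2PEM)) // a mismatch error
}
```

For files on disk, tls.LoadX509KeyPair(certFile, keyFile) performs the same check on the certificate and key paths that the nginx configuration points at.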

netsh.xyz.zone.jnl: create: permission denied: fixing bind9 failing to create the jnl file during ixfr synchronization

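While testing bind9 zone synchronization and its notify, ixfr and axfr mechanisms, I ran rndc reload zone and captured packets on the slave, but saw no notify or ixfr packets. The named.run log contained netsh.xyz.zone.jnl: create: permission denied
My first thought was a permissions or SELinux problem, but that turned out not to be it. A search led to this solution on centos.org: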
Stop Bind Server
service named stop

Move all zones
/var/named/example.com
to
/var/named/data/example.com

and on named.conf
file "data/example.com"

Start Bind Server
service named start

Tested, problem solved. Original link: https://www.centos.org/forums/viewtopic.php?t=5543

__thread: notes on a socket connection that failed

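Background: one of our company's products needed automatic signature database updates: create a temporary thread, connect over a socket and download the updated signature database. A colleague started on this, got stuck partway because of problems with the design, and handed it over to me.
The product is a heavy wrapper around an openstack-based lower layer. Depending on which header files are included, several different socket implementations are exposed to the product layer (sigh).
One header has: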

#ifndef socket
#define socket xx_socket
#endif

while another header declares:

int socket(int domain, int type, int protocol);

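Including different headers gives completely different results. Because the network is wrapped, the logical IP we need is not visible to the lower layer, so socket from <sys/socket.h> does not work and the xx wrapper xx_socket has to be used. When I tried it, the process hit a segmentation fault and restarted as soon as xx_socket was called. The source of the functions on the call stack was not available, and the arguments looked fine. Disassembling (disass) this caller and other existing callers showed that xx_socket jumped to the same address in both, which ruled out a wrong reference.
Looking back at the call stack, the function at the top was getCompCSI(), which returns the component's CSI code and may well be backed by a thread-level variable. With a gdb breakpoint in the component flow before the thread is created, p getCompCSI() returned normally; running the same p inside the temporary thread triggered the restart. That pinned the problem down: the thread-level variable had not been initialized in the newly created thread.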

__thread

The introduction from gcc.gnu.org:

5.48 Thread-Local Storage

Thread-local storage (TLS) is a mechanism by which variables are allocated such that there is one instance of the variable per extant thread. The run-time model GCC uses to implement this originates in the IA-64 processor-specific ABI, but has since been migrated to other processors as well. It requires significant support from the linker (ld), dynamic linker (ld.so), and system libraries (libc.so and libpthread.so), so it is not available everywhere.

At the user level, the extension is visible with a new storage class keyword: __thread. For example:

     __thread int i;
     extern __thread struct state s;
     static __thread char *p;

The __thread specifier may be used alone, with the extern or static specifiers, but with no other storage class specifier. When used with extern or static, __thread must appear immediately after the other storage class specifier.

The __thread specifier may be applied to any global, file-scoped static, function-scoped static, or static data member of a class. It may not be applied to block-scoped automatic or non-static data member.

When the address-of operator is applied to a thread-local variable, it is evaluated at run-time and returns the address of the current thread’s instance of that variable. An address so obtained may be used by any thread. When a thread terminates, any pointers to thread-local variables in that thread become invalid.

No static initialization may refer to the address of a thread-local variable.

In C++, if an initializer is present for a thread-local variable, it must be a constant-expression, as defined in 5.19.2 of the ANSI/ISO C++ standard.

A simple example; try running it yourself:

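#include <stdio.h>
#include <pthread.h>
#include <unistd.h>     /* sleep */

/* Each thread gets its own copy of iTestVal, initialized to 1. */
__thread int iTestVal = 1;

void* thread1(void *arg)
{
        printf("thread1, iTestVal = %d\n", iTestVal);
        iTestVal = 2;   /* changes only thread1's copy */
        sleep(1);
        printf("thread1, after delay iTestVal = %d\n", iTestVal);
        return NULL;
}

void* thread2(void *arg)
{
        printf("thread2, iTestVal = %d\n", iTestVal);
        sleep(1);
        /* still 1: thread1's write is not visible here */
        printf("thread2, after delay iTestVal = %d\n", iTestVal);
        return NULL;
}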


int main()
{
        pthread_t pid1, pid2;
        pthread_create(&pid1, NULL, thread1, NULL);
        pthread_create(&pid2, NULL, thread2, NULL);
        pthread_join(pid1, NULL);
        pthread_join(pid2, NULL);

        return 0;
}

Output:

[root@TubbyFlashy-VM test]# gcc -lpthread main.c 
[root@TubbyFlashy-VM test]# ./a.out 
thread2, iTestVal = 1
thread1, iTestVal = 1
thread2, after delay iTestVal = 1
thread1, after delay iTestVal = 2

windows: IP addresses can be pinged but DNS cannot resolve (dns probe possible): the fix

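Today I turned on my computer and found that no network software could connect. The browser reported dns probe possible. I pinged Google DNS 8.8.8.8 and Alibaba DNS 223.5.5.5, and both responded.
A phone on the same router could browse the web normally.
ipconfig /flushdns did not help.
I suspected that ss had broken the local DNS setup.
I then ran netsh winsock reset, which asked for a reboot. After rebooting everything worked. Bye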